This is part of a test given by an employer.
The solution combines multiple labelling jobs into a single combined mask. In addition, it can show the image for a unique class at a location and report a performance metric for each location.
The following libraries and packages are used:

- PIL - to show images
- numpy - array manipulation
- h5py - to read .h5 files
- pyspark - map-reduce
- tarfile - to read .tar.gz files
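As a rough illustration of how these pieces fit together, the sketch below reads a mask out of a .tar.gz archive with tarfile, parses it with h5py, and shows it with PIL. The archive name, member path, and dataset key are assumptions for illustration, not this repo's actual layout:

```python
import io
import tarfile

import h5py
import numpy as np
from PIL import Image

ARCHIVE = "labels.tar.gz"     # hypothetical archive name
MEMBER = "job_0/labels.h5"    # hypothetical .h5 member inside the archive
DATASET = "mask"              # hypothetical HDF5 dataset key

# Pull the raw .h5 bytes out of the tarball.
with tarfile.open(ARCHIVE, "r:gz") as tar:
    raw = tar.extractfile(MEMBER).read()

# h5py can read from an in-memory file-like object.
with h5py.File(io.BytesIO(raw), "r") as f:
    mask = np.asarray(f[DATASET])

# Scale a 0/1 mask to 0/255 so it is visible, then display it.
Image.fromarray((mask * 255).astype(np.uint8)).show()
```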
Usage:

- To combine the jobs listed in a .csv file with location_id, date and classes columns:
  `python driver.py <csv_file>`
- To see the image for a unique class at a location:
  `python check_location.py --loc_id <location_id> --class`
- To get the performance metric for a location:
  `python measure_performance.py --loc_id <location_id>`
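For reference, a driver input file would carry the three columns named above; the rows here are purely hypothetical:

```csv
location_id,date,classes
1001,2021-05-01,forest
1001,2021-06-15,water
```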
Notes:

- When a pixel receives an equal number of absent and present votes across jobs, it is marked as an undefined pixel.
- The performance metric is the number of pixels where classes overlap divided by the number of pixels assigned to any class (see the sketch below).
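A minimal sketch of both rules, assuming each labelling job yields a binary (0/1) presence mask per class and that -1 marks an undefined pixel; the function names and encoding are illustrative, not this repo's API:

```python
import numpy as np

UNDEFINED = -1  # assumed sentinel for an undefined pixel

def combine_jobs(masks):
    """Majority vote over binary (0/1) masks from several labelling jobs.

    A pixel becomes 1 if more jobs marked it present than absent, 0 in
    the opposite case, and UNDEFINED on a tie.
    """
    stack = np.stack(masks).astype(np.int64)  # shape: (n_jobs, H, W)
    presents = stack.sum(axis=0)
    absents = stack.shape[0] - presents
    voted = np.where(presents > absents, 1, 0)
    return np.where(presents == absents, UNDEFINED, voted)

def performance_metric(class_masks):
    """Pixels claimed by 2+ classes over pixels claimed by any class.

    class_masks: one combined 0/1/UNDEFINED mask per class for a location.
    """
    claims = np.stack([m == 1 for m in class_masks]).sum(axis=0)
    labelled = (claims >= 1).sum()
    return (claims >= 2).sum() / labelled if labelled else 0.0
```

For example, `combine_jobs([m1, m2, m3])` collapses three jobs into one mask, and `performance_metric` over a location's per-class combined masks yields the overlap ratio.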
This solution is scalable, and running it on a Spark cluster would improve performance. Given more time and resources, multiprocessing and fuller distributed computing could be added to improve efficiency (see the sketch below).
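As a hedged sketch of the distributed variant, the map-reduce below groups job masks by location with PySpark and applies the same voting rule as above; the record layout and function names are assumptions, not this repo's code:

```python
import numpy as np
from pyspark import SparkContext

UNDEFINED = -1  # assumed sentinel for an undefined pixel

def finalize(acc):
    """Turn summed presence counts into a voted mask (tie -> UNDEFINED)."""
    presents, n_jobs = acc
    absents = n_jobs - presents
    voted = np.where(presents > absents, 1, 0)
    return np.where(presents == absents, UNDEFINED, voted)

def combine_on_cluster(records):
    """records: iterable of (location_id, 0/1 mask) pairs, one per job."""
    sc = SparkContext.getOrCreate()
    rdd = sc.parallelize(records)
    # Map each mask to (presence counts, job count), then reduce per location.
    summed = (rdd.mapValues(lambda m: (m.astype(np.int64), 1))
                 .reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1])))
    return dict(summed.mapValues(finalize).collect())
```

Submitted via spark-submit against a cluster master, the same code would distribute the per-location reduction across executors instead of a single machine.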