This repository contains a survey submitted as the final "project" for CS 270: Combinatorial Algorithms and Data Structures.
This survey focuses on unsupervised machine learning algorithms, specifically their use on the large datasets that arise in fields such as computer vision, text processing, and malware detection. Accurate ground truth is difficult to obtain in these areas, where a labeling expert is either too expensive or simply does not exist, which makes unsupervised learning algorithms particularly attractive.
In this survey, we analyze two topics in particular: spectral clustering and crowdsourced labeling.