This project presents an approach to improving image segmentation in computer vision. It integrates Principal Component Analysis (PCA) with K-Means clustering and CUDA parallelization to improve both the accuracy and the processing speed of segmentation. PCA reduces the dimensionality of the data and extracts features, K-Means clustering groups the data into segments, and CUDA parallelization runs the computation on the GPU. The method also addresses the challenges of real-time processing and large dataset management.
Check your CUDA version. Make sure it is cuda_11.8.
nvcc --version
Install the Eigen/Dense library:
wget https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.bz2
https://colab.research.google.com/drive/10-OLSOj2MCg3LAly-aQxLJPygKebI2-L?usp=sharing
- Step 1: Standardize the Data
- Step 2: Compute the Covariance Matrix
- Step 3: Calculate Eigenvectors and Eigenvalues
- Step 4: Sort Eigenvectors by Eigenvalues
- Step 5: Choose the Top k Eigenvectors
- Step 6: Project the Data onto Lower-Dimensional Space
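The PCA steps above can be sketched on the CPU in plain C++. This is a minimal illustration, not the project's CUDA/Eigen implementation: the names `Matrix`, `standardize`, `covariance`, `top_eigenvectors`, and `project` are hypothetical, and power iteration with deflation is swapped in for a full eigen-solver, which assumes the top k eigenvalues are distinct and non-zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // rows are samples (hypothetical type)

// Standardize the data: give every column zero mean and unit variance.
void standardize(Matrix& data) {
    assert(data.size() >= 2);
    const std::size_t n = data.size(), d = data[0].size();
    for (std::size_t j = 0; j < d; ++j) {
        double mean = 0.0, var = 0.0;
        for (std::size_t i = 0; i < n; ++i) mean += data[i][j];
        mean /= static_cast<double>(n);
        for (std::size_t i = 0; i < n; ++i)
            var += (data[i][j] - mean) * (data[i][j] - mean);
        const double sd = std::sqrt(var / static_cast<double>(n - 1));
        for (std::size_t i = 0; i < n; ++i)
            data[i][j] = sd > 0.0 ? (data[i][j] - mean) / sd : 0.0;
    }
}

// Compute the covariance matrix of already-centered data.
Matrix covariance(const Matrix& z) {
    const std::size_t n = z.size(), d = z[0].size();
    Matrix cov(d, std::vector<double>(d, 0.0));
    for (std::size_t a = 0; a < d; ++a)
        for (std::size_t b = 0; b < d; ++b) {
            for (std::size_t i = 0; i < n; ++i) cov[a][b] += z[i][a] * z[i][b];
            cov[a][b] /= static_cast<double>(n - 1);
        }
    return cov;
}

// Top k eigenvectors in descending eigenvalue order. Power iteration finds
// the dominant eigenvector; deflation removes it so the next pass finds the next.
Matrix top_eigenvectors(Matrix cov, std::size_t k, int iters = 1000) {
    const std::size_t d = cov.size();
    assert(k <= d);
    Matrix vecs;
    for (std::size_t c = 0; c < k; ++c) {
        std::vector<double> v(d);
        for (std::size_t j = 0; j < d; ++j) v[j] = 1.0 + 0.1 * static_cast<double>(j);
        double lambda = 0.0;
        for (int it = 0; it < iters; ++it) {
            std::vector<double> w(d, 0.0);
            for (std::size_t a = 0; a < d; ++a)
                for (std::size_t b = 0; b < d; ++b) w[a] += cov[a][b] * v[b];
            double norm = 0.0;
            for (double x : w) norm += x * x;
            norm = std::sqrt(norm);
            if (norm < 1e-12) break;  // remaining variance is numerically zero
            for (std::size_t a = 0; a < d; ++a) v[a] = w[a] / norm;
            lambda = norm;  // for a unit v, |cov * v| approaches the eigenvalue
        }
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b) cov[a][b] -= lambda * v[a] * v[b];
        vecs.push_back(v);
    }
    return vecs;
}

// Project the standardized data onto the chosen eigenvectors.
Matrix project(const Matrix& z, const Matrix& vecs) {
    Matrix out(z.size(), std::vector<double>(vecs.size(), 0.0));
    for (std::size_t i = 0; i < z.size(); ++i)
        for (std::size_t c = 0; c < vecs.size(); ++c)
            for (std::size_t j = 0; j < vecs[c].size(); ++j)
                out[i][c] += z[i][j] * vecs[c][j];
    return out;
}
```

A caller would run `standardize(data)`, then `project(data, top_eigenvectors(covariance(data), k))`. The nested loops in `covariance` and `project` are independent across output entries, which is the kind of work a GPU kernel can spread over threads.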
- Step 1: Choose the Number of Clusters (K)
- Step 2: Initialize Cluster Centers
- Step 3: Assign Data Points to Nearest Cluster
- Step 4: Update Cluster Centers
- Step 5: Repeat Steps 3 and 4
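The K-Means steps above can be sketched as a short CPU routine in plain C++. This is an illustration under stated assumptions rather than the project's CUDA code: `Point`, `KMeansResult`, `squared_distance`, and `kmeans` are hypothetical names, and the initial centers are passed in by the caller (instead of being drawn at random) so that the result is reproducible.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;  // e.g. one pixel's feature vector (hypothetical type)

struct KMeansResult {
    std::vector<Point> centers;  // final cluster centers
    std::vector<int> labels;     // cluster index assigned to each point
};

double squared_distance(const Point& a, const Point& b) {
    double s = 0.0;
    for (std::size_t j = 0; j < a.size(); ++j) s += (a[j] - b[j]) * (a[j] - b[j]);
    return s;
}

// The number of clusters K is centers.size(); the caller supplies the
// initial centers, which covers the "choose K" and "initialize" steps.
KMeansResult kmeans(const std::vector<Point>& points, std::vector<Point> centers,
                    int max_iters = 100) {
    assert(!points.empty() && !centers.empty());
    const std::size_t k = centers.size(), d = points[0].size();
    std::vector<int> labels(points.size(), -1);
    for (int it = 0; it < max_iters; ++it) {
        // Assign every point to its nearest center.
        bool changed = false;
        for (std::size_t i = 0; i < points.size(); ++i) {
            int best = 0;
            double best_dist = std::numeric_limits<double>::max();
            for (std::size_t c = 0; c < k; ++c) {
                const double dist = squared_distance(points[i], centers[c]);
                if (dist < best_dist) { best_dist = dist; best = static_cast<int>(c); }
            }
            if (labels[i] != best) { labels[i] = best; changed = true; }
        }
        if (!changed) break;  // assignments are stable, so stop repeating
        // Update each center to the mean of the points assigned to it.
        std::vector<Point> sums(k, Point(d, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < points.size(); ++i) {
            ++counts[labels[i]];
            for (std::size_t j = 0; j < d; ++j) sums[labels[i]][j] += points[i][j];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // an empty cluster keeps its old center
            for (std::size_t j = 0; j < d; ++j)
                centers[c][j] = sums[c][j] / static_cast<double>(counts[c]);
        }
    }
    return {centers, labels};
}
```

For segmentation, each pixel's feature vector would be one `Point`, and the returned `labels` give the segment of each pixel. The assignment loop treats every point independently, so it is the natural part to map onto one GPU thread per point.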
[Images: three pairs of original image and segmentation result]
The algorithm does not work well on objects with low color contrast, because K-means starts from random initial centers and only converges to a local optimum.
K-means clustering is sensitive to its initial conditions and to outliers, which hurts its consistency, and the choice of 'k' strongly influences clustering quality. We are considering a Gaussian Mixture Model (GMM) for image clustering, since it adapts better to the data distribution and shows promise for improved clustering accuracy. However, GMM is harder to parallelize because it relies on Expectation-Maximization. Future work involves parallelizing GMM for GPU computing to make it viable for real-time image analysis. Additionally, data prefetching will be integrated into PCA and K-means to optimize memory access; this should be particularly beneficial for K-means' large matrix operations, speeding up processing and reducing delays.
For more details, please refer to the report.pdf file.