k-Means clustering is an unsupervised machine learning algorithm that segments a dataset into groups based on the similarity of its data points. An unsupervised model has only independent variables (features) and no dependent variable (label). In this project, I break down the basic idea behind the algorithm to make it easy to understand how it works.
- Python: Version 3.10
- NumPy: Version 1.23.0
- SciPy: Version 1.9.1
- Matplotlib: Version 3.5.3
- Spyder IDE: Version 5.3.2
Here I implemented the algorithm from scratch and applied clustering to a dataset. There are four main steps we need to know.
- First, choose some random points from the dataset as the initial centroids; here we pick any 3.
- Second, find the nearest centroid for every point, then assign the point to that centroid.
- Then, re-center each centroid by moving it to the mean of the points assigned to it.
- Finally, repeat the second and third steps until the centroids stop moving.
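The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not necessarily the code in this repository: the names `assign_labels` and `kmeans`, the Euclidean distance, the `max_iters` cap, the fixed `seed`, and the synthetic demo data are all assumptions made for the example.

```python
import numpy as np


def assign_labels(points, centroids):
    """Return, for every point, the index of its nearest centroid."""
    # Pairwise Euclidean distances, shape (n_points, n_centroids).
    distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1)


def kmeans(points, k=3, max_iters=100, seed=0):
    """Cluster `points` (n_samples x n_features) into k groups (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    for _ in range(max_iters):
        # Step 2: assign every point to its nearest centroid.
        labels = assign_labels(points, centroids)

        # Step 3: move each centroid to the mean of its assigned points.
        # A cluster that ends up empty keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # Step 4: repeat until the centroids stop moving.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids

    return centroids, assign_labels(points, centroids)


# Demo on synthetic data: three blobs of 50 two-dimensional points each.
demo_rng = np.random.default_rng(42)
data = np.vstack([
    demo_rng.normal(loc=center, scale=0.3, size=(50, 2))
    for center in ([0, 0], [5, 5], [0, 5])
])
centroids, labels = kmeans(data, k=3)
print(centroids.shape)  # (3, 2)
```

Because the initial centroids are random, a different `seed` may lead to a different final grouping; running the function a few times and comparing the results is a common way to check how stable the clusters are.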
For more details, please check the References.
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Do not forget to give the project a star! Thanks again!
Distributed under the MIT License. See LICENSE.txt for more information.
- This is an important video
- Via Email : [email protected]
- Via Facebook.