minor

Merge branch 'master' of https://github.com/gittar/k-means-u-star
gittar committed Jul 2, 2017
2 parents b8c0f7f + 04ab49d commit 624225b
Showing 1 changed file with 13 additions and 13 deletions.
26 changes: 13 additions & 13 deletions README.md
@@ -1,5 +1,5 @@
# The k-means-u* clustering algorithm

# The k-means-u* algorithm
## non-local jumps and greedy retries improve k-means++ clustering
![GitHub Logo](notebooks/img/example.png)

This repository contains example python code for the k-means-u and k-means-u* algorithms as proposed in https://arxiv.org/abs/1706.09059.
@@ -10,24 +10,24 @@ This repository contains example python code for the k-means-u and k-means-u* alg
* install miniconda or anaconda: https://conda.io/docs/install/quick.html
* create kmus environment: `conda env create -f envsimple.yml`
* activate environment: `source activate kmus` (on windows: `activate kmus`)
* start one of the jupyter notebooks: `jupyter notebook algo-pure.ipynb`
* start one of the jupyter notebooks, e.g.: `jupyter notebook notebooks/algo-pure.ipynb`
* continue in the browser window which opens (jupyter manual: http://jupyter-notebook.readthedocs.io/en/latest/)

## jupyter notebooks:
* algo-pure.ipynb <br>
(a bare-bones implementation meant for easy understanding of the algorithms)
* simu-detail.ipynb <br>
(detailed simulations and graphics to illustrate the way the algrithms work, uses kmeansu.py)
* simu-bulk.ipynb <br>
(systematic simulations with various data sets to compare k-means-++, k-means-u and k-means-u*, uses kmeansu.py)
* dataset_class.ipynb<br>
(examples for using the data generator)
* [algo-pure.ipynb](https://github.com/gittar/k-means-u-star/blob/master/notebooks/algo-pure.ipynb)<br>
a bare-bones implementation meant for easy understanding of the algorithms
* [simu-detail.ipynb](https://github.com/gittar/k-means-u-star/blob/master/notebooks/simu-detail.ipynb) <br>
detailed simulations and graphics to illustrate the way the algorithms work, uses kmeansu.py
* [simu-bulk.ipynb](https://github.com/gittar/k-means-u-star/blob/master/notebooks/simu-bulk.ipynb) <br>
systematic simulations with various data sets to compare k-means++, k-means-u and k-means-u*, uses kmeansu.py
* [dataset_class.ipynb](https://github.com/gittar/k-means-u-star/blob/master/notebooks/dataset_class.ipynb)<br>
examples for using the data generator

## python files:
* kmeansu.py <br>
main implementation of k-means-u and k-means-u*, makes heavy use of
http://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html for efficient implementations of k-means and k-means++, gathers certain statistics while training to enable systematic evaluation, code therefore a bit larger
* bfdataset.py <br>
(contains a class "dataset" to generate test data sets and also an own implementation of k-means++ which allows to get the codebook after initialization but before the run of k-means)
contains a class "dataset" to generate test data sets and also an own implementation of k-means++ which allows to access the codebook *after* initialization but *before* the run of k-means
* bfutil.py <br>
(various utility functions for plotting etc.)
various utility functions for plotting etc.
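
The bfdataset.py entry above mentions getting the k-means++ codebook after initialization but before k-means runs. As an illustration only, here is a minimal standard-library sketch of k-means++ seeding that stops before any Lloyd iteration. It is not the code from bfdataset.py; the names `sq_dist` and `kmeanspp_codebook` are hypothetical, and it assumes the data contains at least `k` distinct points.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_codebook(points, k, seed=None):
    """Return k initial centers chosen by k-means++ seeding.

    No k-means (Lloyd) iterations are run, so the result is the
    codebook as it stands directly after initialization.
    Hypothetical helper, not part of the repository.
    """
    rng = random.Random(seed)
    # First center: drawn uniformly at random from the data.
    codebook = [rng.choice(points)]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, codebook[0]) for p in points]
    while len(codebook) < k:
        if sum(d2) == 0:
            raise ValueError("fewer distinct points than requested centers")
        # Next center: drawn with probability proportional to d2.
        idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        codebook.append(points[idx])
        # Update nearest-center distances with the new center.
        d2 = [min(d, sq_dist(p, points[idx])) for d, p in zip(d2, points)]
    return codebook


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
initial_codebook = kmeanspp_codebook(points, k=3, seed=0)
```

The returned list can be inspected or plotted as is, or passed on as the starting codebook for a subsequent k-means run.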
