GitHub - ai-ku/wkmeans: k-means algorithm with (optional) instance weights.

WKMEANS		            Copyright (c) 2012, Deniz Yuret

Usage: wkmeans [options] < input > output
  -k number of clusters (default 2)
  -r number of restarts (default 0)
  -s random seed
  -l input file contains labels
  -w input file contains instance weights
  -v verbose output
  
Input format (assuming you have m vectors of n dimensions):
[label_1] [weight_1] x_11 ... x_1n
...
[label_m] [weight_m] x_m1 ... x_mn

label_i  : (string) label of the ith vector, required when -l is used
weight_i : (double) weight of the ith vector, required when -w is used
x_ij     : (double) ith vector's jth component

Output format:
[label_1] c_1
...
[label_m] c_m

c_i : (int) cluster of ith vector
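
As an illustration of the input format above, here is a minimal sketch of
how one input line could be split into label, weight and components.  It
is not taken from wkmeans.c: the function name parse_line, its
parameters, and the assumption that fields are separated by spaces or
tabs are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wkmeans.
 * Splits one input line in place (the buffer is modified by strtok).
 * has_label / has_weight mirror the -l and -w options.
 * Returns the number of components stored in x, or -1 if the line is
 * malformed or has more than max_n components. */
int parse_line(char *line, int has_label, int has_weight,
               const char **label, double *weight, double *x, int max_n)
{
    const char *sep = " \t\r\n";   /* assumed field separators */
    char *tok = strtok(line, sep);
    char *end;
    int n = 0;

    *label = NULL;
    *weight = 1.0;                 /* unweighted instances count as 1 */

    if (has_label) {
        if (tok == NULL) return -1;
        *label = tok;              /* points into the caller's buffer */
        tok = strtok(NULL, sep);
    }
    if (has_weight) {
        if (tok == NULL) return -1;
        *weight = strtod(tok, &end);
        if (end == tok || *end != '\0') return -1;
        tok = strtok(NULL, sep);
    }
    while (tok != NULL) {
        if (n >= max_n) return -1;
        x[n] = strtod(tok, &end);
        if (end == tok || *end != '\0') return -1;
        n++;
        tok = strtok(NULL, sep);
    }
    return n;
}
```

With both has_label and has_weight set, a line such as
"cat 2.5 1.0 -3.0" would give label "cat", weight 2.5 and the
two-dimensional vector (1.0, -3.0).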


Algorithm: wkmeans is a k-means algorithm with (optional) instance
weights.
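
To show what instance weights change in k-means, the sketch below
computes each cluster center as the weighted mean of its members rather
than the plain mean.  This is the usual textbook formulation and only an
assumption about what wkmeans does internally; the function name
weighted_centroids and its array layout are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from wkmeans.c.
 * x      : m*n data matrix, row-major (one vector per row)
 * w      : m instance weights, or NULL to treat every weight as 1.0
 * assign : cluster id in [0, k) for each of the m vectors
 * cent   : k*n output matrix of cluster centers
 * wsum   : scratch array of length k, total weight per cluster
 * A cluster with zero total weight keeps an all-zero center. */
void weighted_centroids(const double *x, const double *w, const int *assign,
                        size_t m, size_t n, size_t k,
                        double *cent, double *wsum)
{
    size_t i, j, c;

    for (c = 0; c < k; c++) {
        wsum[c] = 0.0;
        for (j = 0; j < n; j++)
            cent[c * n + j] = 0.0;
    }
    /* Accumulate weighted sums per cluster. */
    for (i = 0; i < m; i++) {
        double wi = w ? w[i] : 1.0;
        c = (size_t)assign[i];
        wsum[c] += wi;
        for (j = 0; j < n; j++)
            cent[c * n + j] += wi * x[i * n + j];
    }
    /* Normalize by the total weight to get weighted means. */
    for (c = 0; c < k; c++) {
        if (wsum[c] > 0.0)
            for (j = 0; j < n; j++)
                cent[c * n + j] /= wsum[c];
    }
}
```

With weights 1 and 3 on the points (0,0) and (4,0) in the same cluster,
the center moves to (3,0) instead of the unweighted midpoint (2,0).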

* Based on mpi_kmeans-1.5 by Peter Gehler.

* Based on C. Elkan. Using the triangle inequality to accelerate
  kMeans. ICML 2003.

* Initialization based on Arthur, D. and Vassilvitskii,
  S. (2007). K-means++: the advantages of careful seeding.
  Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete
  algorithms. pp. 1027-1035.
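
The k-means++ seeding cited above picks each new center with probability
proportional to its squared distance from the nearest center chosen so
far.  The sketch below is one plausible way to combine that with instance
weights, by sampling proportionally to weight times squared distance.
Whether wkmeans uses exactly this rule is an assumption, and the names
sqdist, sample_index and kmeanspp_seed are hypothetical.

```c
#include <stdlib.h>
#include <float.h>

/* Squared Euclidean distance between two n-dimensional vectors. */
double sqdist(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    size_t j;
    for (j = 0; j < n; j++) {
        double d = a[j] - b[j];
        s += d * d;
    }
    return s;
}

/* Return index i with probability proportional to p[i], given a
 * uniform draw u in [0, 1).  Entries with p[i] <= 0 are never chosen
 * unless every entry is non-positive, in which case 0 is returned. */
size_t sample_index(const double *p, size_t m, double u)
{
    double total = 0.0, acc = 0.0, target;
    size_t i, last = 0;

    for (i = 0; i < m; i++)
        if (p[i] > 0.0) total += p[i];
    target = u * total;
    for (i = 0; i < m; i++) {
        if (p[i] <= 0.0) continue;
        acc += p[i];
        last = i;
        if (target < acc) return i;
    }
    return last;   /* fallback for floating-point rounding */
}

/* Choose k seed rows from the m*n row-major matrix x.
 * w is an array of m weights or NULL; d2 and p are scratch arrays of
 * length m.  Uses rand(), so call srand() first to set the seed. */
void kmeanspp_seed(const double *x, const double *w, size_t m, size_t n,
                   size_t k, size_t *seeds, double *d2, double *p)
{
    size_t i, c;

    /* First seed: proportional to the instance weight alone. */
    for (i = 0; i < m; i++) {
        d2[i] = DBL_MAX;
        p[i] = w ? w[i] : 1.0;
    }
    seeds[0] = sample_index(p, m, rand() / ((double)RAND_MAX + 1.0));

    for (c = 1; c < k; c++) {
        const double *last = x + seeds[c - 1] * n;
        /* Update distance to the nearest seed, then reweight. */
        for (i = 0; i < m; i++) {
            double d = sqdist(x + i * n, last, n);
            if (d < d2[i]) d2[i] = d;
            p[i] = (w ? w[i] : 1.0) * d2[i];
        }
        seeds[c] = sample_index(p, m, rand() / ((double)RAND_MAX + 1.0));
    }
}
```

A point that coincides with an already chosen seed has squared distance
zero, so it is not picked again as long as some other point lies at a
positive distance.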

Please see the file LICENSE for terms of use.  Everything is standard
C, so just typing make should give you an executable.