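KMeans clustering partitions the data into k clusters, where k is specified a priori.
It minimizes
\[ \sum_{i=1}^{k} \sum_{j \in S_i} \lVert x_j - \mu_i \rVert^2 \]
where $x_j$ are the data points, $\mu_i$ are the cluster centers and
$S_i$ are the index sets of the clusters.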
Beware that this algorithm obtains only a local optimum.
cf. http://en.wikipedia.org/wiki/K-means_algorithm
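The local-optimum caveat can be illustrated with a minimal, self-contained sketch of the standard Lloyd iteration for this objective. It is not the CKMeans implementation and uses none of its API: the names `Point`, `sq_dist`, `kmeans_objective` and `lloyd` are hypothetical, the squared Euclidean distance is assumed, and the initial centers are passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper type for this sketch: one dense data point.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double sq_dist(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        s += diff * diff;
    }
    return s;
}

// Value of the k-means objective: sum over all points of the squared
// distance to the center of the cluster the point is assigned to.
double kmeans_objective(const std::vector<Point>& x,
                        const std::vector<Point>& mu,
                        const std::vector<std::size_t>& assign) {
    double total = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j)
        total += sq_dist(x[j], mu[assign[j]]);
    return total;
}

// Lloyd iterations starting from the centers in `mu` (updated in place).
// Returns the cluster index of every point. Stops when the assignment no
// longer changes or after max_iter iterations.
std::vector<std::size_t> lloyd(const std::vector<Point>& x,
                               std::vector<Point>& mu,
                               int max_iter) {
    assert(!x.empty() && !mu.empty());
    const std::size_t dim = x[0].size();
    std::vector<std::size_t> assign(x.size(), 0);
    for (int it = 0; it < max_iter; ++it) {
        // Assignment step: attach each point to its nearest center.
        bool changed = false;
        for (std::size_t j = 0; j < x.size(); ++j) {
            std::size_t best = 0;
            double best_d = std::numeric_limits<double>::max();
            for (std::size_t i = 0; i < mu.size(); ++i) {
                const double d = sq_dist(x[j], mu[i]);
                if (d < best_d) { best_d = d; best = i; }
            }
            if (best != assign[j]) { assign[j] = best; changed = true; }
        }
        // Converged: the centers already match this assignment.
        if (it > 0 && !changed) break;

        // Update step: move each center to the mean of its points.
        std::vector<Point> sum(mu.size(), Point(dim, 0.0));
        std::vector<std::size_t> count(mu.size(), 0);
        for (std::size_t j = 0; j < x.size(); ++j) {
            for (std::size_t d = 0; d < dim; ++d)
                sum[assign[j]][d] += x[j][d];
            ++count[assign[j]];
        }
        for (std::size_t i = 0; i < mu.size(); ++i) {
            if (count[i] == 0) continue;  // leave empty clusters untouched
            for (std::size_t d = 0; d < dim; ++d)
                mu[i][d] = sum[i][d] / static_cast<double>(count[i]);
        }
    }
    return assign;
}
```

For the four points (0,0), (0,1), (4,0), (4,1) and k = 2, starting this sketch from the centers (0,0.5) and (4,0.5) yields an objective of 1, while starting from (2,0) and (2,1) converges immediately to a stable partition with an objective of 16. The result therefore depends on the starting centers.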
Definition at line 39 of file KMeans.h.

Public Member Functions | |
| CKMeans () | |
| CKMeans (int32_t k, CDistance *d) | |
| virtual | ~CKMeans () |
| virtual EClassifierType | get_classifier_type () |
| virtual bool | load (FILE *srcfile) |
| virtual bool | save (FILE *dstfile) |
| void | set_k (int32_t p_k) |
| int32_t | get_k () |
| void | set_max_iter (int32_t iter) |
| float64_t | get_max_iter () |
| SGVector< float64_t > | get_radiuses () |
| SGMatrix< float64_t > | get_cluster_centers () |
| int32_t | get_dimensions () |
| virtual const char * | get_name () const |
Protected Member Functions | |
| void | clustknb (bool use_old_mus, float64_t *mus_start) |
| virtual bool | train_machine (CFeatures *data=NULL) |
| virtual void | store_model_features () |
Protected Attributes | |
| int32_t | max_iter |
| maximum number of iterations | |
| int32_t | k |
| the k parameter in KMeans | |
| int32_t | dimensions |
| number of dimensions | |
| SGVector< float64_t > | R |
| radii of the clusters (size k) | |
| CKMeans | ( | ) |
default constructor
Definition at line 29 of file KMeans.cpp.
| ~CKMeans | ( | ) | [virtual] |
Definition at line 43 of file KMeans.cpp.
| void clustknb | ( | bool | use_old_mus, | |
| float64_t * | mus_start | |||
| ) | [protected] |
clustknb
| use_old_mus | whether the old mus (cluster centers) shall be used | |
| mus_start | starting values for the mus |
replace rhs feature vectors
set rhs to mus_start
update rhs
Definition at line 134 of file KMeans.cpp.
| virtual EClassifierType get_classifier_type | ( | ) | [virtual] |
| int32_t get_dimensions | ( | ) |
| float64_t get_max_iter | ( | ) |
| virtual const char* get_name | ( | void | ) | const [virtual] |
| bool load | ( | FILE * | srcfile | ) | [virtual] |
load distance machine from file
| srcfile | file to load from |
Reimplemented from CMachine.
Definition at line 73 of file KMeans.cpp.
| bool save | ( | FILE * | dstfile | ) | [virtual] |
save distance machine to file
| dstfile | file to save to |
Reimplemented from CMachine.
Definition at line 80 of file KMeans.cpp.
| void set_max_iter | ( | int32_t | iter | ) |
| void store_model_features | ( | ) | [protected, virtual] |
Ensures cluster centers are in lhs of underlying distance
Reimplemented from CDistanceMachine.
Definition at line 419 of file KMeans.cpp.
| bool train_machine | ( | CFeatures * | data = NULL |
) | [protected, virtual] |
train k-means
| data | training data (the parameter can be omitted if distance- or kernel-based classifiers are used and the distance/kernel has already been initialized with the training data) |
Reimplemented from CMachine.
Definition at line 48 of file KMeans.cpp.
int32_t dimensions [protected] |