
CKMeans Class Reference


Detailed Description

KMeans clustering partitions the data into k clusters, where k is specified a priori.

It minimizes

\[ \sum_{i=1}^k\sum_{x_j\in S_i} (x_j-\mu_i)^2 \]

where $\mu_i$ are the cluster centers and $S_i,\;i=1,\dots,k$ are the clusters, i.e. the sets of points $x_j$ assigned to center $\mu_i$.
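
As an illustration, the objective can be evaluated for a fixed assignment as sketched below. This is a minimal sketch, not code from KMeans.cpp: the helper name kmeans_objective, the flat one-vector-after-another layout and the use of the squared Euclidean distance for $(x_j-\mu_i)^2$ are all assumptions (CKMeans itself takes a CDistance object).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CKMeans: evaluates
// sum_i sum_{x_j in S_i} ||x_j - mu_i||^2 for a given assignment.
// points:  n vectors of length dim, stored one after another
// centers: k vectors of length dim, stored one after another
// assign:  assign[j] is the cluster index i with x_j in S_i
double kmeans_objective(const std::vector<double>& points,
                        const std::vector<double>& centers,
                        const std::vector<int>& assign,
                        std::size_t dim)
{
    assert(dim > 0 && points.size() == assign.size() * dim);
    double total = 0.0;
    for (std::size_t j = 0; j < assign.size(); ++j) {
        const double* x = &points[j * dim];
        const double* mu = &centers[static_cast<std::size_t>(assign[j]) * dim];
        // Accumulate the squared Euclidean distance between x_j and its center.
        for (std::size_t d = 0; d < dim; ++d) {
            const double diff = x[d] - mu[d];
            total += diff * diff;
        }
    }
    return total;
}
```

A lower value means the points lie closer to their assigned centers; comparing the value for two different sets of starting centers is one way to see that the algorithm can settle in different local optima.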

Beware that this algorithm is only guaranteed to reach a local optimum; the result can depend on the initial cluster centers.

cf. http://en.wikipedia.org/wiki/K-means_algorithm

Definition at line 39 of file KMeans.h.

[Figure: inheritance diagram for CKMeans]

Public Member Functions

 CKMeans ()
 CKMeans (int32_t k, CDistance *d)
virtual ~CKMeans ()
virtual EClassifierType get_classifier_type ()
virtual bool train (CFeatures *data=NULL)
virtual bool load (FILE *srcfile)
virtual bool save (FILE *dstfile)
void set_k (int32_t p_k)
int32_t get_k ()
void set_max_iter (int32_t iter)
float64_t get_max_iter ()
void get_radi (float64_t *&radi, int32_t &num)
void get_centers (float64_t *&centers, int32_t &dim, int32_t &num)
void get_radiuses (float64_t **radii, int32_t *num)
void get_cluster_centers (float64_t **centers, int32_t *dim, int32_t *num)
int32_t get_dimensions ()

Protected Member Functions

void clustknb (bool use_old_mus, float64_t *mus_start)
virtual CLabels * classify ()
virtual CLabels * classify (CFeatures *data)
virtual const char * get_name () const

Protected Attributes

int32_t max_iter
 maximum number of iterations
int32_t k
 the k parameter in KMeans
int32_t dimensions
 number of dimensions
float64_t * R
 radii of the clusters (size k)
float64_t * mus
 centers of the clusters (size dimensions x k)

Constructor & Destructor Documentation

CKMeans (  ) 

default constructor

Definition at line 29 of file KMeans.cpp.

CKMeans ( int32_t  k,
CDistance * d 
)

constructor

Parameters:
k parameter k
d distance

Definition at line 35 of file KMeans.cpp.

~CKMeans (  )  [virtual]

Definition at line 42 of file KMeans.cpp.


Member Function Documentation

virtual CLabels* classify (  )  [protected, virtual]

classify objects using the currently set features

Returns:
classified labels

Implements CDistanceMachine.

Definition at line 199 of file KMeans.h.

virtual CLabels* classify ( CFeatures * data  )  [protected, virtual]

classify objects

Parameters:
data (test)data to be classified
Returns:
classified labels

Implements CDistanceMachine.

Definition at line 210 of file KMeans.h.

void clustknb ( bool  use_old_mus,
float64_t * mus_start 
) [protected]

clustknb

Parameters:
use_old_mus if old mus shall be used
mus_start mus start

replace rhs feature vectors

set rhs to mus_start

update rhs

Definition at line 133 of file KMeans.cpp.

void get_centers ( float64_t *&  centers,
int32_t &  dim,
int32_t &  num 
)

get centers

Parameters:
centers current centers are stored in here
dim dimensions are stored in here
num number of centers is stored in here

Definition at line 138 of file KMeans.h.
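
The following sketch shows how a single center could be read out of the buffer returned through the centers parameter. It rests on assumptions this page does not confirm: that float64_t is a double, and that the "dimensions x k" buffer stores the dim values of each center contiguously, one center after another. The helper center_at is hypothetical and not part of CKMeans.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: copies center `c` out of a flat buffer such as the one
// handed back by get_centers(). ASSUMPTION: the buffer holds `num` centers of
// `dim` values each, with the values of one center stored contiguously.
std::vector<double> center_at(const double* centers, int dim, int num, int c)
{
    assert(centers != nullptr && dim > 0 && c >= 0 && c < num);
    const double* first = centers + static_cast<std::size_t>(c) * dim;
    return std::vector<double>(first, first + dim);
}
```

If the actual layout turns out to be the transpose, the index arithmetic changes to a stride of num between the values of one center.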

virtual EClassifierType get_classifier_type (  )  [virtual]

get classifier type

Returns:
classifier type KMEANS

Reimplemented from CClassifier.

Definition at line 57 of file KMeans.h.

void get_cluster_centers ( float64_t **  centers,
int32_t *  dim,
int32_t *  num 
)

get cluster centers (swig compatible)

Parameters:
centers current cluster centers are stored in here
dim dimensions are stored in here
num number of centers is stored in here

Definition at line 166 of file KMeans.h.

int32_t get_dimensions (  ) 

get dimensions

Returns:
number of dimensions

Definition at line 182 of file KMeans.h.

int32_t get_k (  ) 

get k

Returns:
the parameter k

Definition at line 97 of file KMeans.h.

float64_t get_max_iter (  ) 

get maximum number of iterations

Returns:
maximum number of iterations

Definition at line 116 of file KMeans.h.

virtual const char* get_name ( void   )  const [protected, virtual]
Returns:
object name

Reimplemented from CDistanceMachine.

Definition at line 219 of file KMeans.h.

void get_radi ( float64_t *&  radi,
int32_t &  num 
)

get radii

Parameters:
radi current radii are stored in here
num number of radii is stored in here

Definition at line 126 of file KMeans.h.

void get_radiuses ( float64_t **  radii,
int32_t *  num 
)

get radii (swig compatible)

Parameters:
radii current radii are stored in here
num number of radii is stored in here

Definition at line 150 of file KMeans.h.

bool load ( FILE *  srcfile  )  [virtual]

load distance machine from file

Parameters:
srcfile file to load from
Returns:
whether loading was successful

Reimplemented from CClassifier.

Definition at line 72 of file KMeans.cpp.

bool save ( FILE *  dstfile  )  [virtual]

save distance machine to file

Parameters:
dstfile file to save to
Returns:
whether saving was successful

Reimplemented from CClassifier.

Definition at line 79 of file KMeans.cpp.

void set_k ( int32_t  p_k  ) 

set k

Parameters:
p_k new k

Definition at line 87 of file KMeans.h.

void set_max_iter ( int32_t  iter  ) 

set maximum number of iterations

Parameters:
iter the new maximum

Definition at line 106 of file KMeans.h.

bool train ( CFeatures * data = NULL  )  [virtual]

train k-means

Parameters:
data training data (may be omitted if a distance- or kernel-based classifier is used and the distance/kernel has already been initialized with the training data)
Returns:
whether training was successful

Reimplemented from CClassifier.

Definition at line 48 of file KMeans.cpp.
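
For orientation, the classic Lloyd iteration that k-means training is usually based on can be sketched as follows. This is not the code in KMeans.cpp: the function lloyd_kmeans, its flat data layout and the hard-coded squared Euclidean distance are assumptions made for illustration, whereas CKMeans works on a CDistance object. The max_iter argument plays the role of the limit set via set_max_iter.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch of Lloyd-style k-means with squared Euclidean distance;
// this is NOT the CKMeans implementation.
// points:  n vectors of length dim, stored one after another
// centers: k initial centers, same layout; overwritten with the result
// Returns the number of iterations actually performed (<= max_iter).
int lloyd_kmeans(const std::vector<double>& points, std::vector<double>& centers,
                 std::size_t dim, int max_iter)
{
    assert(dim > 0 && points.size() % dim == 0 && centers.size() % dim == 0);
    const std::size_t n = points.size() / dim;
    const std::size_t k = centers.size() / dim;
    assert(k > 0);
    std::vector<std::size_t> assign(n, k);  // k means "not assigned yet"

    int iter = 0;
    while (iter < max_iter) {
        ++iter;
        // Assignment step: attach every point to its nearest center.
        bool changed = false;
        for (std::size_t j = 0; j < n; ++j) {
            std::size_t best = 0;
            double best_dist = std::numeric_limits<double>::max();
            for (std::size_t i = 0; i < k; ++i) {
                double dist = 0.0;
                for (std::size_t d = 0; d < dim; ++d) {
                    const double diff = points[j * dim + d] - centers[i * dim + d];
                    dist += diff * diff;
                }
                if (dist < best_dist) { best_dist = dist; best = i; }
            }
            if (assign[j] != best) { assign[j] = best; changed = true; }
        }
        if (!changed) break;  // assignments are stable: a local optimum

        // Update step: move the center of each non-empty cluster to its mean.
        std::vector<double> sums(k * dim, 0.0);
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t j = 0; j < n; ++j) {
            ++counts[assign[j]];
            for (std::size_t d = 0; d < dim; ++d)
                sums[assign[j] * dim + d] += points[j * dim + d];
        }
        for (std::size_t i = 0; i < k; ++i)
            if (counts[i] > 0)
                for (std::size_t d = 0; d < dim; ++d)
                    centers[i * dim + d] = sums[i * dim + d] / counts[i];
    }
    return iter;
}
```

The loop stops either when no point changes cluster or when max_iter is reached, whichever comes first; empty clusters simply keep their previous center in this sketch.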


Member Data Documentation

int32_t dimensions [protected]

number of dimensions

Definition at line 229 of file KMeans.h.

int32_t k [protected]

the k parameter in KMeans

Definition at line 226 of file KMeans.h.

int32_t max_iter [protected]

maximum number of iterations

Definition at line 223 of file KMeans.h.

float64_t* mus [protected]

centers of the clusters (size dimensions x k)

Definition at line 235 of file KMeans.h.

float64_t* R [protected]

radii of the clusters (size k)

Definition at line 232 of file KMeans.h.


The documentation for this class was generated from the following files:
 KMeans.h
 KMeans.cpp

SHOGUN Machine Learning Toolbox - Documentation