
CKMeans Class Reference


Detailed Description

KMeans clustering partitions the data into k clusters, where k is specified a priori.

It minimizes

\[ \sum_{i=1}^k\sum_{x_j\in S_i} (x_j-\mu_i)^2 \]

where $\mu_i$ are the cluster centers and $S_i,\;i=1,\dots,k$ are the index sets of the clusters.
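
To make the objective concrete, the following is a minimal standalone sketch that evaluates the sum above for a given assignment of points to clusters. It does not use Shogun: the function name `kmeans_objective`, the row-per-point storage, and the `assignment` vector (cluster index of each point) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Shogun): evaluates the k-means objective.
// points[j] is the j-th data point, centers[i] is the center mu_i, and
// assignment[j] is the index i of the cluster S_i that contains point j.
double kmeans_objective(const std::vector<std::vector<double>>& points,
                        const std::vector<std::vector<double>>& centers,
                        const std::vector<std::size_t>& assignment)
{
    assert(points.size() == assignment.size());
    double total = 0.0;
    for (std::size_t j = 0; j < points.size(); ++j)
    {
        const std::vector<double>& mu = centers[assignment[j]];
        assert(mu.size() == points[j].size());
        for (std::size_t d = 0; d < points[j].size(); ++d)
        {
            const double diff = points[j][d] - mu[d];
            total += diff * diff;  // accumulate the squared Euclidean distance
        }
    }
    return total;
}
```

For example, the points (0,0), (2,0) and (10,0) with centers (1,0) and (10,0) and assignment {0, 0, 1} give an objective of 1 + 1 + 0 = 2.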

Beware that this algorithm obtains only a local optimum of this objective, so the result depends on the initial cluster centers.
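
The local-optimum caveat can be illustrated with a small standalone sketch of the standard Lloyd iteration (assign each point to its nearest center, then move each center to the mean of its points). This is not Shogun's implementation: the `Point` struct, the `lloyd` function and its `max_iter` argument are hypothetical names chosen for this example, and squared Euclidean distance is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };

// Squared Euclidean distance between two 2-D points.
double sq_dist(const Point& a, const Point& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Hypothetical standalone Lloyd iteration (not Shogun's code). Alternates
// nearest-center assignment and mean update until the assignment stops
// changing or max_iter iterations have run. Centers are updated in place;
// the return value is the objective for the final assignment.
double lloyd(const std::vector<Point>& pts, std::vector<Point>& centers, int max_iter)
{
    assert(!centers.empty() && max_iter > 0);
    std::vector<std::size_t> assign(pts.size(), 0);
    for (int iter = 0; iter < max_iter; ++iter)
    {
        bool changed = (iter == 0);  // always run the first update step
        // Assignment step: pick the nearest center for every point.
        for (std::size_t j = 0; j < pts.size(); ++j)
        {
            std::size_t best = 0;
            double best_d = std::numeric_limits<double>::max();
            for (std::size_t i = 0; i < centers.size(); ++i)
            {
                const double d = sq_dist(pts[j], centers[i]);
                if (d < best_d) { best_d = d; best = i; }
            }
            if (best != assign[j]) { assign[j] = best; changed = true; }
        }
        if (!changed) break;  // fixed point reached
        // Update step: move each center to the mean of its assigned points.
        std::vector<Point> sum(centers.size(), Point{0.0, 0.0});
        std::vector<std::size_t> count(centers.size(), 0);
        for (std::size_t j = 0; j < pts.size(); ++j)
        {
            sum[assign[j]].x += pts[j].x;
            sum[assign[j]].y += pts[j].y;
            ++count[assign[j]];
        }
        for (std::size_t i = 0; i < centers.size(); ++i)
        {
            if (count[i] > 0)  // an empty cluster keeps its previous center
            {
                const double n = static_cast<double>(count[i]);
                centers[i] = Point{sum[i].x / n, sum[i].y / n};
            }
        }
    }
    double objective = 0.0;
    for (std::size_t j = 0; j < pts.size(); ++j)
        objective += sq_dist(pts[j], centers[assign[j]]);
    return objective;
}
```

Take the four corners of a 4 x 1 rectangle, (0,0), (0,1), (4,0), (4,1), with k = 2. Starting from centers (0,0) and (4,0), the sketch converges to centers (0,0.5) and (4,0.5) with objective 1. Starting from (2,0) and (2,1), the centers are already a fixed point and the objective stays at 16, a local optimum that is not the global one.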

cf. http://en.wikipedia.org/wiki/K-means_algorithm

Definition at line 39 of file KMeans.h.

Inheritance diagram for CKMeans: [graph not shown]


Public Member Functions

 CKMeans ()
 CKMeans (int32_t k, CDistance *d)
virtual ~CKMeans ()
virtual EClassifierType get_classifier_type ()
virtual bool load (FILE *srcfile)
virtual bool save (FILE *dstfile)
void set_k (int32_t p_k)
int32_t get_k ()
void set_max_iter (int32_t iter)
float64_t get_max_iter ()
SGVector< float64_t > get_radiuses ()
SGMatrix< float64_t > get_cluster_centers ()
int32_t get_dimensions ()
virtual const char * get_name () const

Protected Member Functions

void clustknb (bool use_old_mus, float64_t *mus_start)
virtual bool train_machine (CFeatures *data=NULL)
virtual void store_model_features ()

Protected Attributes

int32_t max_iter
 maximum number of iterations
int32_t k
 the k parameter in KMeans
int32_t dimensions
 number of dimensions
SGVector< float64_t > R
 radii of the clusters (size k)

Constructor & Destructor Documentation

CKMeans (  ) 

default constructor

Definition at line 29 of file KMeans.cpp.

CKMeans ( int32_t  k,
CDistance * d 
)

constructor

Parameters:
k number of clusters k
d distance to use

Definition at line 35 of file KMeans.cpp.

~CKMeans (  )  [virtual]

Definition at line 43 of file KMeans.cpp.


Member Function Documentation

void clustknb ( bool  use_old_mus,
float64_t * mus_start 
) [protected]

clustknb

Parameters:
use_old_mus whether the old cluster centers (mus) shall be used
mus_start starting values for the cluster centers (mus)

replace rhs feature vectors

set rhs to mus_start

update rhs

Definition at line 134 of file KMeans.cpp.

virtual EClassifierType get_classifier_type (  )  [virtual]

get classifier type

Returns:
classifier type KMEANS

Reimplemented from CMachine.

Definition at line 57 of file KMeans.h.

SGMatrix<float64_t> get_cluster_centers (  ) 

get centers

Definition at line 119 of file KMeans.h.
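
As an illustration of what the returned centers can be used for, the sketch below finds the center nearest to a query point. It is standalone and does not call Shogun. It assumes, without confirmation from this page, that the centers form a matrix with `dimensions` rows and k columns (one center per column) stored in column-major order; the function name `nearest_center` and the use of squared Euclidean distance are likewise assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Shogun): returns the index of the center
// nearest to x under squared Euclidean distance. `centers` is a flat
// column-major buffer of size dimensions * k, so center i occupies the
// entries centers[i * dimensions] .. centers[(i + 1) * dimensions - 1].
std::size_t nearest_center(const std::vector<double>& centers,
                           std::size_t dimensions, std::size_t k,
                           const std::vector<double>& x)
{
    assert(k > 0 && centers.size() == dimensions * k && x.size() == dimensions);
    std::size_t best = 0;
    double best_dist = 0.0;
    for (std::size_t i = 0; i < k; ++i)
    {
        double dist = 0.0;
        for (std::size_t d = 0; d < dimensions; ++d)
        {
            const double diff = x[d] - centers[i * dimensions + d];
            dist += diff * diff;
        }
        // The first center initializes the running minimum; ties keep the lower index.
        if (i == 0 || dist < best_dist) { best = i; best_dist = dist; }
    }
    return best;
}
```

With three 2-D centers (0,0), (10,0) and (0,10) stored as {0, 0, 10, 0, 0, 10}, the query point (9,1) maps to index 1 and (1,8) maps to index 2.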

int32_t get_dimensions (  ) 

get dimensions

Returns:
number of dimensions

Definition at line 136 of file KMeans.h.

int32_t get_k (  ) 

get k

Returns:
the parameter k

Definition at line 87 of file KMeans.h.

float64_t get_max_iter (  ) 

get maximum number of iterations

Returns:
maximum number of iterations

Definition at line 106 of file KMeans.h.

virtual const char* get_name ( void   )  const [virtual]

Returns:
object name

Reimplemented from CDistanceMachine.

Definition at line 142 of file KMeans.h.

SGVector<float64_t> get_radiuses (  ) 

get the radii of the clusters

Definition at line 114 of file KMeans.h.

bool load ( FILE *  srcfile  )  [virtual]

load distance machine from file

Parameters:
srcfile file to load from
Returns:
whether loading was successful

Reimplemented from CMachine.

Definition at line 73 of file KMeans.cpp.

bool save ( FILE *  dstfile  )  [virtual]

save distance machine to file

Parameters:
dstfile file to save to
Returns:
whether saving was successful

Reimplemented from CMachine.

Definition at line 80 of file KMeans.cpp.

void set_k ( int32_t  p_k  ) 

set k

Parameters:
p_k new k

Definition at line 77 of file KMeans.h.

void set_max_iter ( int32_t  iter  ) 

set maximum number of iterations

Parameters:
iter the new maximum

Definition at line 96 of file KMeans.h.

void store_model_features (  )  [protected, virtual]

Ensures that the cluster centers are stored as the lhs (left-hand side) features of the underlying distance.

Reimplemented from CDistanceMachine.

Definition at line 419 of file KMeans.cpp.

bool train_machine ( CFeatures * data = NULL  )  [protected, virtual]

train k-means

Parameters:
data training data (can be omitted for distance- or kernel-based machines whose distance or kernel has already been initialized with the training data)
Returns:
whether training was successful

Reimplemented from CMachine.

Definition at line 48 of file KMeans.cpp.


Member Data Documentation

int32_t dimensions [protected]

number of dimensions

Definition at line 176 of file KMeans.h.

int32_t k [protected]

the k parameter in KMeans

Definition at line 173 of file KMeans.h.

int32_t max_iter [protected]

maximum number of iterations

Definition at line 170 of file KMeans.h.

SGVector<float64_t> R [protected]

radii of the clusters (size k)

Definition at line 179 of file KMeans.h.


The documentation for this class was generated from the following files:
 KMeans.h
 KMeans.cpp

SHOGUN Machine Learning Toolbox - Documentation