Agglomerative hierarchical single linkage clustering.
Starting with each object assigned to its own cluster, clusters are merged iteratively. At each step, the two clusters whose elements have minimum distance are merged, i.e. the clusters A and B that attain
\[ \min\{ d(x, x') : x \in A,\ x' \in B \} \]
are merged.
Cf., e.g., http://en.wikipedia.org/wiki/Data_clustering
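The merge rule described above can be illustrated with a small standalone sketch. It is not the code in Hierarchical.cpp: `MergeResult`, `euclidean` and `single_linkage` are hypothetical names, the Euclidean distance stands in for whatever CDistance is supplied, and the pair search is deliberately naive.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical result type for this sketch (not part of CHierarchical).
struct MergeResult
{
    std::vector<std::pair<int, int>> pairs; // cluster ids merged at each step
    std::vector<double> merge_distance;     // linkage distance of each merge
    std::vector<int> assignment;            // final cluster id of each point
};

// Euclidean distance between two points of equal dimension (an assumption;
// any distance measure could be substituted here).
static double euclidean(const std::vector<double>& a, const std::vector<double>& b)
{
    double sum = 0.0;
    for (std::size_t k = 0; k < a.size(); ++k)
        sum += (a[k] - b[k]) * (a[k] - b[k]);
    return std::sqrt(sum);
}

// Naive single linkage: performs at most `merges` merges (and at most n-1).
MergeResult single_linkage(const std::vector<std::vector<double>>& points, int merges)
{
    const int n = static_cast<int>(points.size());
    MergeResult result;
    result.assignment.resize(n);
    for (int i = 0; i < n; ++i)
        result.assignment[i] = i; // every point starts in its own cluster

    for (int step = 0; step < merges && step < n - 1; ++step)
    {
        double best = std::numeric_limits<double>::infinity();
        int a = -1, b = -1;

        // The closest pair of points lying in different clusters identifies
        // the two clusters with minimum single-linkage distance.
        for (int i = 0; i < n; ++i)
            for (int j = i + 1; j < n; ++j)
            {
                if (result.assignment[i] == result.assignment[j])
                    continue;
                const double d = euclidean(points[i], points[j]);
                if (d < best)
                {
                    best = d;
                    a = result.assignment[i];
                    b = result.assignment[j];
                }
            }

        // Merge cluster b into cluster a by relabelling its members.
        for (int i = 0; i < n; ++i)
            if (result.assignment[i] == b)
                result.assignment[i] = a;

        result.pairs.emplace_back(a, b);
        result.merge_distance.push_back(best);
    }
    return result;
}
```

Calling `single_linkage(points, m)` on a small point set yields one entry in `pairs` and `merge_distance` per merge performed. The sketch rescans all point pairs on every step for clarity; a practical implementation would cache the distances.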
Definition at line 37 of file Hierarchical.h.
Public Member Functions
CHierarchical ()
CHierarchical (int32_t merges, CDistance *d)
virtual | ~CHierarchical ()
virtual EClassifierType | get_classifier_type ()
virtual bool | load (FILE *srcfile)
virtual bool | save (FILE *dstfile)
void | set_merges (int32_t m)
int32_t | get_merges ()
SGVector< int32_t > | get_assignment ()
SGVector< float64_t > | get_merge_distances ()
SGMatrix< int32_t > | get_cluster_pairs ()
virtual const char * | get_name () const
Protected Member Functions
virtual bool | train_machine (CFeatures *data=NULL)
virtual void | store_model_features ()
virtual CLabels * | apply ()
virtual CLabels * | apply (CFeatures *data)
virtual float64_t | apply (int32_t num)
Protected Attributes
int32_t | merges | the number of merges in hierarchical clustering
int32_t | dimensions | number of dimensions
int32_t | assignment_size | size of assignment table
int32_t * | assignment | cluster assignment for the num_points
int32_t | table_size | size of the below tables
int32_t * | pairs | tuples of i/j
float64_t * | merge_distance | distance at which pair i/j was added
CHierarchical | ( | ) |
default constructor
Definition at line 34 of file Hierarchical.cpp.
CHierarchical | ( | int32_t | merges, | CDistance * | d | ) |
constructor
merges | the number of merges
d | distance
Definition at line 40 of file Hierarchical.cpp.
~CHierarchical | ( | ) | [virtual] |
Definition at line 47 of file Hierarchical.cpp.
CLabels * apply | ( | ) | [protected, virtual] |
NOT IMPLEMENTED
Reimplemented from CDistanceMachine.
Definition at line 177 of file Hierarchical.cpp.
CLabels * apply | ( | CFeatures * | data | ) | [protected, virtual] |
NOT IMPLEMENTED
Reimplemented from CDistanceMachine.
Definition at line 172 of file Hierarchical.cpp.
float64_t apply | ( | int32_t | num | ) | [protected, virtual] |
NOT IMPLEMENTED
Reimplemented from CDistanceMachine.
Definition at line 183 of file Hierarchical.cpp.
SGVector<int32_t> get_assignment | ( | ) |
get assignment
Definition at line 93 of file Hierarchical.h.
virtual EClassifierType get_classifier_type | ( | ) | [virtual] |
get classifier type
Reimplemented from CMachine.
Definition at line 55 of file Hierarchical.h.
SGMatrix<int32_t> get_cluster_pairs | ( | ) |
get cluster pairs
Definition at line 109 of file Hierarchical.h.
SGVector<float64_t> get_merge_distances | ( | ) |
get merge distances
Definition at line 101 of file Hierarchical.h.
int32_t get_merges | ( | ) |
virtual const char* get_name | ( | void | ) | const [virtual] |
Reimplemented from CDistanceMachine.
Definition at line 115 of file Hierarchical.h.
bool load | ( | FILE * | srcfile | ) | [virtual] |
load distance machine from file
srcfile | file to load from |
Reimplemented from CMachine.
Definition at line 153 of file Hierarchical.cpp.
bool save | ( | FILE * | dstfile | ) | [virtual] |
save distance machine to file
dstfile | file to save to |
Reimplemented from CMachine.
Definition at line 160 of file Hierarchical.cpp.
void set_merges | ( | int32_t | m | ) |
void store_model_features | ( | ) | [protected, virtual] |
TODO: Ensure cluster centers are in the lhs of the underlying distance. Currently does nothing.
Reimplemented from CDistanceMachine.
Definition at line 167 of file Hierarchical.cpp.
bool train_machine | ( | CFeatures * | data = NULL | ) | [protected, virtual] |
estimate hierarchical clustering
data | training data (may be omitted if distance- or kernel-based classifiers are used and the distance/kernel has already been initialized with the training data) |
Reimplemented from CMachine.
Definition at line 54 of file Hierarchical.cpp.
int32_t* assignment [protected] |
cluster assignment for the num_points
Definition at line 153 of file Hierarchical.h.
int32_t assignment_size [protected] |
size of assignment table
Definition at line 150 of file Hierarchical.h.
int32_t dimensions [protected] |
number of dimensions
Definition at line 147 of file Hierarchical.h.
float64_t* merge_distance [protected] |
distance at which pair i/j was added
Definition at line 162 of file Hierarchical.h.
int32_t merges [protected] |
the number of merges in hierarchical clustering
Definition at line 144 of file Hierarchical.h.
int32_t* pairs [protected] |
tuples of i/j
Definition at line 159 of file Hierarchical.h.
int32_t table_size [protected] |
size of the below tables
Definition at line 156 of file Hierarchical.h.