Class KNN, an implementation of the standard k-nearest neighbor classifier.
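An example is assigned to the class to which the majority of its k closest examples belong. Formally, kNN is described as

\[ \text{label for } x = \arg\max_{l} \sum_{i=1}^{k} [\text{label of } i\text{-th example} = l] \]

This class can also perform weighted classification using

\[ \text{label for } x = \arg\max_{l} \sum_{i=1}^{k} [\text{label of } i\text{-th example} = l] \, q^{i}, \]

where $|q| < 1$.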
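The voting rule described above can be sketched in a few lines of standalone C++. This is only an illustration, not the CKNN implementation: `weighted_vote` is a hypothetical helper, it assumes the neighbor labels arrive sorted from closest to farthest and already mapped to 0..num_classes-1, and it assumes the i-th closest neighbor contributes a weight of q^i.

```cpp
#include <vector>

// Hypothetical helper, not part of the CKNN interface.
// neighbor_labels: labels of the k nearest training examples, closest first,
//                  each assumed to lie in [0, num_classes).
// q:               rank-weighting parameter (assumed weight q^i for rank i);
//                  q = 1.0 reduces to the plain majority vote.
int weighted_vote(const std::vector<int>& neighbor_labels, int num_classes, double q)
{
    std::vector<double> score(num_classes, 0.0);
    double weight = q;                 // weight of the closest neighbor, q^1
    for (int label : neighbor_labels) {
        score[label] += weight;        // add this neighbor's vote to its class
        weight *= q;                   // the next rank is weighted by one more factor of q
    }
    int best = 0;                      // ties resolve to the smallest label
    for (int l = 1; l < num_classes; ++l) {
        if (score[l] > score[best]) {
            best = l;
        }
    }
    return best;
}
```

With q below 1 the closest neighbors dominate, so a single very close example can outvote two more distant ones.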
To avoid ties in two-class problems, k should be an odd number. To define how close examples are, k-NN requires a CDistance object to work with (e.g., CEuclideanDistance).
Note that k-NN has zero training time, but classification time increases sharply with the number of examples. Also note that k-NN is capable of multi-class classification. Finally, for k=1 classification takes less time thanks to a special optimization.
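To make the cost trade-off concrete, here is a minimal brute-force sketch in standalone C++. It is not the CKNN code: `Point`, `squared_distance`, `nearest_labels` and `nearest_label` are hypothetical names, squared Euclidean distance stands in for a CDistance object, and the 1-NN shortcut shown (a single linear scan with no sorting) is only one plausible form of the k=1 optimization, which the text does not spell out. Both functions assume a non-empty training set.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::vector<double>;

// Squared Euclidean distance; the square root is skipped because it does not
// change which points are closest.
double squared_distance(const Point& a, const Point& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Labels of the k nearest training points, closest first. No training step is
// needed, but every query touches every training point.
std::vector<int> nearest_labels(const std::vector<Point>& train,
                                const std::vector<int>& labels,
                                const Point& query, std::size_t k)
{
    std::vector<std::pair<double, int>> dist;  // (distance, label)
    dist.reserve(train.size());
    for (std::size_t i = 0; i < train.size(); ++i) {
        dist.emplace_back(squared_distance(train[i], query), labels[i]);
    }
    k = std::min(k, dist.size());
    const auto middle = dist.begin() + static_cast<std::ptrdiff_t>(k);
    std::partial_sort(dist.begin(), middle, dist.end());  // order only the first k
    std::vector<int> out;
    out.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        out.push_back(dist[i].second);
    }
    return out;
}

// 1-NN special case: one pass that tracks the running minimum, no sorting.
int nearest_label(const std::vector<Point>& train,
                  const std::vector<int>& labels, const Point& query)
{
    std::size_t best = 0;
    double best_dist = squared_distance(train[0], query);
    for (std::size_t i = 1; i < train.size(); ++i) {
        const double d = squared_distance(train[i], query);
        if (d < best_dist) {
            best_dist = d;
            best = i;
        }
    }
    return labels[best];
}
```

The per-query work is proportional to the number of training examples in both functions; the k=1 variant merely avoids storing and partially sorting the distances.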
Definition at line 53 of file KNN.h.
Public Member Functions
| Return type | Member | Description |
| --- | --- | --- |
| | CKNN () | |
| | CKNN (int32_t k, CDistance *d, CLabels *trainlab) | |
| virtual | ~CKNN () | |
| virtual EClassifierType | get_classifier_type () | |
| virtual CLabels * | apply () | |
| virtual CLabels * | apply (CFeatures *data) | |
| virtual float64_t | apply (int32_t vec_idx) | get output for example "vec_idx" |
| SGMatrix< int32_t > | classify_for_multiple_k () | |
| virtual bool | load (FILE *srcfile) | |
| virtual bool | save (FILE *dstfile) | |
| void | set_k (int32_t k) | |
| int32_t | get_k () | |
| void | set_q (float64_t q) | |
| float64_t | get_q () | |
| virtual const char * | get_name () const | |
Protected Member Functions
| Return type | Member | Description |
| --- | --- | --- |
| virtual void | store_model_features () | |
| virtual CLabels * | classify_NN () | |
| void | init_distance (CFeatures *data) | |
| virtual bool | train_machine (CFeatures *data=NULL) | |
Protected Attributes
| Type | Attribute | Description |
| --- | --- | --- |
| int32_t | m_k | the k parameter in KNN |
| float64_t | m_q | parameter q of rank weighting |
| int32_t | num_classes | number of classes (i.e. number of values labels can take) |
| int32_t | min_label | smallest label, i.e. -1 |
| SGVector< int32_t > | train_labels | |
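The attributes num_classes, min_label and train_labels suggest that labels are shifted so that the smallest one maps to index 0, which lets a vote histogram be a plain array. How CKNN actually fills these fields is not shown here, so the following standalone C++ is only a sketch under that assumption; `LabelInfo` and `normalize_labels` are hypothetical names, and the input is assumed to be non-empty.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical container mirroring the three protected attributes.
struct LabelInfo {
    int min_label;                  // smallest raw label, e.g. -1
    int num_classes;                // number of values labels can take (assumed max - min + 1)
    std::vector<int> train_labels;  // raw labels shifted to start at 0
};

// Assumes raw is non-empty.
LabelInfo normalize_labels(const std::vector<int>& raw)
{
    const auto mm = std::minmax_element(raw.begin(), raw.end());
    LabelInfo info;
    info.min_label = *mm.first;
    info.num_classes = *mm.second - *mm.first + 1;
    info.train_labels.reserve(raw.size());
    for (int label : raw) {
        info.train_labels.push_back(label - info.min_label);  // shift into [0, num_classes)
    }
    return info;
}
```

Under this assumption, a predicted index is turned back into a raw label by adding min_label to it.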
CLabels * apply () [virtual]
classify all examples
histogram of classes and returned output
Reimplemented from CDistanceMachine.
virtual float64_t apply (int32_t vec_idx) [virtual]
get output for example "vec_idx"
Reimplemented from CDistanceMachine.
CLabels * apply (CFeatures *data) [virtual]
classify objects
Parameters:
data: (test) data to be classified
Reimplemented from CDistanceMachine.
SGMatrix< int32_t > classify_for_multiple_k ()
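No description is given for classify_for_multiple_k. One plausible reading of its name and matrix return type is that it yields, for each example, the prediction for every neighborhood size from 1 up to k. Under that assumption, the per-example work can be sketched in standalone C++ by growing the vote histogram one neighbor at a time; `votes_for_all_k` is a hypothetical helper, uses the unweighted vote, and expects labels sorted closest first and mapped to 0..num_classes-1.

```cpp
#include <vector>

// Hypothetical helper: out[j] is the majority-vote label when only the
// j + 1 closest neighbors are considered. Ties resolve to the smallest label.
std::vector<int> votes_for_all_k(const std::vector<int>& neighbor_labels, int num_classes)
{
    std::vector<int> count(num_classes, 0);
    std::vector<int> out;
    out.reserve(neighbor_labels.size());
    for (int label : neighbor_labels) {
        ++count[label];                       // extend the neighborhood by one example
        int best = 0;
        for (int l = 1; l < num_classes; ++l) {
            if (count[l] > count[best]) {
                best = l;
            }
        }
        out.push_back(best);                  // prediction for the current neighborhood size
    }
    return out;
}
```

Reusing one histogram this way means the neighbors are found and sorted once, rather than once per value of k.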
CLabels * classify_NN () [protected, virtual]
virtual EClassifierType get_classifier_type () [virtual]
virtual const char * get_name () const [virtual]
void init_distance (CFeatures *data) [protected]
bool load (FILE *srcfile) [virtual]
bool save (FILE *dstfile) [virtual]
void store_model_features () [protected, virtual]
Stores feature data of underlying model.
Replaces lhs and rhs of underlying distance with copies of themselves
Reimplemented from CDistanceMachine.
bool train_machine (CFeatures *data=NULL) [protected, virtual]
int32_t num_classes [protected]
SGVector< int32_t > train_labels [protected]