
CSimpleFeatures< ST > Class Template Reference


Detailed Description

template<class ST>
class shogun::CSimpleFeatures< ST >

The class SimpleFeatures implements dense feature matrices.

The feature matrices are stored en bloc in memory in Fortran order, i.e. column by column, where each column is a feature vector.
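To make the column-by-column layout concrete, here is a small standalone sketch of the index arithmetic it implies. The helper names (column_major_index, column_ptr) are hypothetical and not part of Shogun; the sketch uses a plain std::vector<double> in place of the class's own storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Shogun: offset of feature `feat` of
// vector `vec` in a column-major matrix with `num_feat` rows.
inline std::size_t column_major_index(std::size_t feat, std::size_t vec,
                                      std::size_t num_feat)
{
    assert(feat < num_feat);
    return vec * num_feat + feat;
}

// Pointer to the first entry of feature vector `vec`; its num_feat
// entries are contiguous in memory.
inline const double* column_ptr(const std::vector<double>& matrix,
                                std::size_t vec, std::size_t num_feat)
{
    assert((vec + 1) * num_feat <= matrix.size());
    return matrix.data() + vec * num_feat;
}
```

Because a whole feature vector is contiguous, handing out a pointer to one column needs no copy.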

There are get_num_vectors() feature vectors, each of dimension get_num_features(). To access a feature vector, call get_feature_vector(); when you are done with it, call free_feature_vector(). While free_feature_vector() is a no-op in most cases, feature vectors may have been generated on the fly (because a number of preprocessors are attached to them).
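The get/free pairing can be pictured with a toy stand-in. The sketch below is not the Shogun class: ToyFeatures, get_vector and free_vector are hypothetical names, and the "computation" is a placeholder. It only illustrates why the caller must always hand the vector back: the vector is either a pointer into the stored matrix (nothing to release) or memory allocated for that call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in, not the Shogun class: serves vectors straight from a
// stored matrix, or computes them on demand when no matrix is present.
struct ToyFeatures
{
    std::vector<double> matrix; // column-major, may be empty
    int32_t num_feat = 0;
    int live_allocations = 0;   // bookkeeping for the demonstration only

    // Returns vector `num`; `dofree` tells the caller whether the memory
    // was allocated for this call.
    double* get_vector(int32_t num, int32_t& len, bool& dofree)
    {
        assert(num >= 0);
        len = num_feat;
        if (!matrix.empty())
        {
            dofree = false; // pointer into the matrix, nothing to release
            return matrix.data() + static_cast<std::size_t>(num) * num_feat;
        }
        dofree = true;      // computed on the fly, must be handed back
        ++live_allocations;
        double* v = new double[num_feat];
        for (int32_t i = 0; i < num_feat; ++i)
            v[i] = num;     // placeholder "computation"
        return v;
    }

    // Releases the vector only if it was allocated by get_vector.
    void free_vector(double* v, bool dofree)
    {
        if (dofree)
        {
            delete[] v;
            --live_allocations;
        }
    }
};
```

Calling the free function unconditionally keeps caller code identical in both cases, which is presumably why the real interface pairs the two calls.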

A number of dense feature matrix types derived from this template class are used and supported.

Definition at line 65 of file SimpleFeatures.h.



Public Member Functions

 CSimpleFeatures (int32_t size=0)
 CSimpleFeatures (const CSimpleFeatures &orig)
 CSimpleFeatures (SGMatrix< ST > matrix)
 CSimpleFeatures (ST *src, int32_t num_feat, int32_t num_vec)
 CSimpleFeatures (CFile *loader)
virtual CFeatures * duplicate () const
virtual ~CSimpleFeatures ()
void free_feature_matrix ()
void free_features ()
ST * get_feature_vector (int32_t num, int32_t &len, bool &dofree)
void set_feature_vector (SGVector< ST > vector, int32_t num)
SGVector< ST > get_feature_vector (int32_t num)
void free_feature_vector (ST *feat_vec, int32_t num, bool dofree)
void free_feature_vector (SGVector< ST > vec, int32_t num)
void vector_subset (int32_t *idx, int32_t idx_len)
void feature_subset (int32_t *idx, int32_t idx_len)
void get_feature_matrix (ST **dst, int32_t *num_feat, int32_t *num_vec)
SGMatrix< ST > get_feature_matrix ()
void set_feature_matrix (SGMatrix< ST > matrix)
ST * get_feature_matrix (int32_t &num_feat, int32_t &num_vec)
CSimpleFeatures< ST > * get_transposed ()
ST * get_transposed (int32_t &num_feat, int32_t &num_vec)
virtual void set_feature_matrix (ST *fm, int32_t num_feat, int32_t num_vec)
virtual void copy_feature_matrix (SGMatrix< ST > src)
void obtain_from_dot (CDotFeatures *df)
virtual bool apply_preprocessor (bool force_preprocessing=false)
virtual int32_t get_size ()
virtual int32_t get_num_vectors () const
int32_t get_num_features ()
void set_num_features (int32_t num)
void set_num_vectors (int32_t num)
void initialize_cache ()
virtual EFeatureClass get_feature_class ()
virtual EFeatureType get_feature_type ()
virtual bool reshape (int32_t p_num_features, int32_t p_num_vectors)
virtual int32_t get_dim_feature_space () const
virtual float64_t dot (int32_t vec_idx1, CDotFeatures *df, int32_t vec_idx2)
virtual float64_t dense_dot (int32_t vec_idx1, const float64_t *vec2, int32_t vec2_len)
virtual void add_to_dense_vec (float64_t alpha, int32_t vec_idx1, float64_t *vec2, int32_t vec2_len, bool abs_val=false)
virtual int32_t get_nnz_features_for_vector (int32_t num)
virtual bool Align_char_features (CStringFeatures< char > *cf, CStringFeatures< char > *Ref, float64_t gapCost)
virtual void load (CFile *loader)
virtual void save (CFile *saver)
virtual void * get_feature_iterator (int32_t vector_index)
virtual bool get_next_feature (int32_t &index, float64_t &value, void *iterator)
virtual void free_feature_iterator (void *iterator)
virtual CFeatures * copy_subset (SGVector< index_t > indices)
virtual const char * get_name () const

Protected Member Functions

virtual ST * compute_feature_vector (int32_t num, int32_t &len, ST *target=NULL)

Protected Attributes

int32_t num_vectors
 number of vectors in cache
int32_t num_features
 number of features in cache
ST * feature_matrix
int32_t feature_matrix_num_vectors
int32_t feature_matrix_num_features
CCache< ST > * feature_cache

Constructor & Destructor Documentation

CSimpleFeatures ( int32_t  size = 0  ) 

constructor

Parameters:
size cache size

Definition at line 72 of file SimpleFeatures.h.

CSimpleFeatures ( const CSimpleFeatures< ST > &  orig  ) 

copy constructor

Definition at line 75 of file SimpleFeatures.h.

CSimpleFeatures ( SGMatrix< ST >  matrix  ) 

constructor

Parameters:
matrix feature matrix

Definition at line 89 of file SimpleFeatures.h.

CSimpleFeatures ( ST *  src,
int32_t  num_feat,
int32_t  num_vec 
)

constructor

Parameters:
src feature matrix
num_feat number of features in matrix
num_vec number of vectors in matrix

Definition at line 102 of file SimpleFeatures.h.

CSimpleFeatures ( CFile *  loader  ) 

constructor loading features from file

Parameters:
loader File object via which to load data

Definition at line 113 of file SimpleFeatures.h.

virtual ~CSimpleFeatures (  )  [virtual]

Definition at line 129 of file SimpleFeatures.h.


Member Function Documentation

virtual void add_to_dense_vec ( float64_t  alpha,
int32_t  vec_idx1,
float64_t *  vec2,
int32_t  vec2_len,
bool  abs_val = false 
) [virtual]

add vector 1 multiplied by alpha to dense vector 2

possible with subset

Parameters:
alpha scalar alpha
vec_idx1 index of first vector
vec2 pointer to real valued vector
vec2_len length of real valued vector
abs_val if true add the absolute value

Implements CDotFeatures.

Definition at line 824 of file SimpleFeatures.h.
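The operation described for add_to_dense_vec can be written out as a short standalone sketch. The function name add_column_to_dense and the use of std::vector<double> are illustrative choices, not Shogun code. Applying the absolute value to the feature vector's entries is an assumption about what "add the absolute value" means.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: vec2 += alpha * (column vec_idx1 of a column-major matrix).
// With abs_val set, the absolute value of each matrix entry is used; this
// reading of the flag is an assumption.
void add_column_to_dense(const std::vector<double>& matrix, int32_t num_feat,
                         double alpha, int32_t vec_idx1,
                         double* vec2, int32_t vec2_len, bool abs_val = false)
{
    assert(vec2_len == num_feat); // both vectors must have the same length
    const double* col =
        matrix.data() + static_cast<std::size_t>(vec_idx1) * num_feat;
    for (int32_t i = 0; i < num_feat; ++i)
        vec2[i] += alpha * (abs_val ? std::fabs(col[i]) : col[i]);
}
```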

virtual bool Align_char_features ( CStringFeatures< char > *  cf,
CStringFeatures< char > *  Ref,
float64_t  gapCost 
) [virtual]

align char features

Parameters:
cf char features
Ref other char features
gapCost gap cost
Returns:
if aligning was successful

Definition at line 867 of file SimpleFeatures.h.

virtual bool apply_preprocessor ( bool  force_preprocessing = false  )  [virtual]

apply preprocessor

applies preprocessors to ALL features (subset removed before and restored afterwards)

not possible with subset

Parameters:
force_preprocessing if preprocessing shall be forced
Returns:
if applying was successful

Definition at line 623 of file SimpleFeatures.h.

virtual ST* compute_feature_vector ( int32_t  num,
int32_t &  len,
ST *  target = NULL 
) [protected, virtual]

Compute the feature vector for sample num. If target is set, the vector is written to target. len is returned by reference.

NOT IMPLEMENTED!

Parameters:
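num index of the sample
len length of the feature vector (returned by reference)
target optional buffer the vector is written to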
Returns:
feature vector

Reimplemented in CFKFeatures, CRealFileFeatures, and CTOPFeatures.

Definition at line 1008 of file SimpleFeatures.h.

virtual void copy_feature_matrix ( SGMatrix< ST >  src  )  [virtual]

Copy feature matrix. Stores a copy of the feature matrix, where num_features is the column offset and columns are linear in memory. See below for the definition of feature_matrix.

not possible with subset

Parameters:
src feature matrix to copy

Definition at line 559 of file SimpleFeatures.h.

virtual CFeatures* copy_subset ( SGVector< index_t >  indices  )  [virtual]

Creates a new CFeatures instance containing copies of the elements which are specified by the provided indices.

possible with subset

Parameters:
indices indices of feature elements to copy
Returns:
new CFeatures instance with copies of feature data

Reimplemented from CFeatures.

Definition at line 978 of file SimpleFeatures.h.

virtual float64_t dense_dot ( int32_t  vec_idx1,
const float64_t *  vec2,
int32_t  vec2_len 
) [virtual]

compute dot product between vector1 and a dense vector

possible with subset TODO: where?

Parameters:
vec_idx1 index of first vector
vec2 pointer to real valued vector
vec2_len length of real valued vector

Implements CDotFeatures.
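As an illustration of the dot product described above, the following standalone sketch multiplies one column of a column-major matrix with a dense vector. column_dense_dot is a hypothetical name and the code is not taken from SimpleFeatures.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: dot product between column vec_idx1 of a column-major
// matrix and a dense vector of the same length.
double column_dense_dot(const std::vector<double>& matrix, int32_t num_feat,
                        int32_t vec_idx1, const double* vec2, int32_t vec2_len)
{
    assert(vec2_len == num_feat); // dimensions must match
    const double* col =
        matrix.data() + static_cast<std::size_t>(vec_idx1) * num_feat;
    double result = 0.0;
    for (int32_t i = 0; i < num_feat; ++i)
        result += col[i] * vec2[i];
    return result;
}
```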

virtual float64_t dot ( int32_t  vec_idx1,
CDotFeatures *  df,
int32_t  vec_idx2 
) [virtual]

compute dot product between vector1 and vector2, specified by their indices

possible with subset

Parameters:
vec_idx1 index of first vector
df DotFeatures (of same kind) to compute dot product with
vec_idx2 index of second vector

Implements CDotFeatures.

Definition at line 781 of file SimpleFeatures.h.

virtual CFeatures* duplicate (  )  const [virtual]

duplicate feature object

Returns:
feature object

Implements CFeatures.

Definition at line 124 of file SimpleFeatures.h.

void feature_subset ( int32_t *  idx,
int32_t  idx_len 
)

Extracts the features mentioned in idx and replaces them in feature matrix in place.

It does not resize the allocated memory block.

Not possible with subset.

Parameters:
idx index with features that shall remain in the feature matrix
idx_len length of the index

Note: assumes idx is sorted

Definition at line 365 of file SimpleFeatures.h.
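The in-place extraction can be sketched as follows. This is a standalone illustration, not the code at the line referenced above: keep_features_in_place is a hypothetical name, and the sketch assumes idx is sorted in ascending order without duplicates, which is what makes overwriting the buffer from front to back safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: keep the rows (features) listed in idx, compacting each
// column towards the front of the buffer. The buffer itself is not resized.
// Assumes idx is sorted ascending without duplicates.
void keep_features_in_place(std::vector<double>& matrix, int32_t& num_feat,
                            int32_t num_vec, const int32_t* idx,
                            int32_t idx_len)
{
    assert(idx_len <= num_feat);
    for (int32_t v = 0; v < num_vec; ++v)
    {
        const double* src =
            matrix.data() + static_cast<std::size_t>(v) * num_feat;
        double* dst = matrix.data() + static_cast<std::size_t>(v) * idx_len;
        for (int32_t i = 0; i < idx_len; ++i)
        {
            assert(idx[i] >= i && idx[i] < num_feat);
            dst[i] = src[idx[i]]; // write position never passes read position
        }
    }
    num_feat = idx_len; // logical size shrinks, allocation stays
}
```

After the call only the first idx_len * num_vec entries of the buffer are meaningful; the tail still holds stale values.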

virtual void free_feature_iterator ( void *  iterator  )  [virtual]

Clean up the iterator. Call this function with the iterator returned by get_feature_iterator.

Parameters:
iterator as returned by get_feature_iterator

Implements CDotFeatures.

Definition at line 960 of file SimpleFeatures.h.

void free_feature_matrix (  ) 

free feature matrix

Any subset is removed

Definition at line 135 of file SimpleFeatures.h.

void free_feature_vector ( SGVector< ST >  vec,
int32_t  num 
)

free feature vector

possible with subset

Parameters:
vec feature vector to free
num index in feature cache

Definition at line 303 of file SimpleFeatures.h.

void free_feature_vector ( ST *  feat_vec,
int32_t  num,
bool  dofree 
)

free feature vector

possible with subset

Parameters:
feat_vec feature vector to free
num index in feature cache
dofree if vector should be really deleted

Definition at line 287 of file SimpleFeatures.h.

void free_features (  ) 

free feature matrix and cache

Any subset is removed

Definition at line 150 of file SimpleFeatures.h.

virtual int32_t get_dim_feature_space (  )  const [virtual]

obtain the dimensionality of the feature space

(do not confuse this with the dimensionality of the input space, usually obtained via get_num_features())

Returns:
dimensionality

Implements CDotFeatures.

Definition at line 770 of file SimpleFeatures.h.

virtual EFeatureClass get_feature_class (  )  [virtual]

get feature class

Returns:
feature class SIMPLE

Implements CFeatures.

Definition at line 732 of file SimpleFeatures.h.

virtual void* get_feature_iterator ( int32_t  vector_index  )  [virtual]

iterate over the non-zero features

call get_feature_iterator first, followed by get_next_feature, and finally free_feature_iterator to clean up

possible with subset

Parameters:
vector_index the index of the vector whose components are iterated over
Returns:
feature iterator (to be passed to get_next_feature)

Implements CDotFeatures.

Definition at line 914 of file SimpleFeatures.h.

void get_feature_matrix ( ST **  dst,
int32_t *  num_feat,
int32_t *  num_vec 
)

Get a copy of the feature matrix. num_feat and num_vec are returned by reference.

possible with subset

Parameters:
dst destination to store matrix in
num_feat number of features (rows of matrix)
num_vec number of vectors (columns of matrix)

Definition at line 404 of file SimpleFeatures.h.

SGMatrix<ST> get_feature_matrix (  ) 

Getter for feature matrix

subset is ignored

Returns:
matrix feature matrix

Definition at line 437 of file SimpleFeatures.h.

ST* get_feature_matrix ( int32_t &  num_feat,
int32_t &  num_vec 
)

Get the pointer to the feature matrix. num_feat and num_vec are returned by reference.

subset is ignored

Parameters:
num_feat number of features in matrix
num_vec number of vectors in matrix
Returns:
feature matrix

Definition at line 468 of file SimpleFeatures.h.

virtual EFeatureType get_feature_type (  )  [virtual]

get feature type

Returns:
templated feature type

Implements CFeatures.

ST* get_feature_vector ( int32_t  num,
int32_t &  len,
bool &  dofree 
)

Get the feature vector for sample num. If the feature matrix is initialized, the vector is returned from the matrix as it is; otherwise the preprocessed result of compute_feature_vector (not implemented) is returned.

Parameters:
num index of feature vector
len length is returned by reference
dofree whether returned vector must be freed by caller via free_feature_vector
Returns:
feature vector

Definition at line 168 of file SimpleFeatures.h.

SGVector<ST> get_feature_vector ( int32_t  num  ) 

get feature vector num

possible with subset

Parameters:
num index of vector
Returns:
feature vector

Definition at line 263 of file SimpleFeatures.h.

virtual const char* get_name ( void   )  const [virtual]
Returns:
object name

Implements CSGObject.

Reimplemented in CFKFeatures, CRealFileFeatures, and CTOPFeatures.

Definition at line 994 of file SimpleFeatures.h.

virtual bool get_next_feature ( int32_t &  index,
float64_t &  value,
void *  iterator 
) [virtual]

iterate over the non-zero features

call this function with the iterator returned by get_feature_iterator and call free_feature_iterator to clean up

possible with subset

Parameters:
index is returned by reference (-1 when not available)
value is returned by reference
iterator as returned by get_feature_iterator
Returns:
true if a new non-zero feature got returned

Implements CDotFeatures.

Definition at line 942 of file SimpleFeatures.h.
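The three-call iteration protocol (obtain an iterator, fetch features until false is returned, free the iterator) can be mimicked with a toy iterator. Everything below is hypothetical and typed for clarity: the real interface passes the iterator as void*, and whether the dense implementation actually skips zero entries is not stated on this page. The sketch follows the documented contract and skips them.

```cpp
#include <cassert>
#include <cstdint>

// Toy iterator state, not the Shogun type.
struct ToyIterator
{
    const double* vec; // the vector being walked
    int32_t len;       // its length
    int32_t pos;       // next position to inspect
};

// Creates an iterator over vec; must be released with toy_free_iterator.
ToyIterator* toy_get_iterator(const double* vec, int32_t len)
{
    assert(len >= 0);
    return new ToyIterator{vec, len, 0};
}

// Advances to the next non-zero entry. Returns false (and index == -1)
// once the vector is exhausted.
bool toy_get_next(int32_t& index, double& value, ToyIterator* it)
{
    while (it->pos < it->len)
    {
        const int32_t i = it->pos++;
        if (it->vec[i] != 0.0)
        {
            index = i;
            value = it->vec[i];
            return true;
        }
    }
    index = -1;
    return false;
}

void toy_free_iterator(ToyIterator* it) { delete it; }
```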

virtual int32_t get_nnz_features_for_vector ( int32_t  num  )  [virtual]

get number of non-zero features in vector

Parameters:
num which vector
Returns:
number of non-zero features in vector

Implements CDotFeatures.

Definition at line 854 of file SimpleFeatures.h.

int32_t get_num_features (  ) 

get number of features (of possible subset)

Returns:
number of features

Definition at line 683 of file SimpleFeatures.h.

virtual int32_t get_num_vectors (  )  const [virtual]

get number of feature vectors

Returns:
number of feature vectors

Implements CFeatures.

Definition at line 674 of file SimpleFeatures.h.

virtual int32_t get_size (  )  [virtual]

get memory footprint of one feature

Returns:
memory footprint of one feature

Implements CFeatures.

Definition at line 668 of file SimpleFeatures.h.

ST* get_transposed ( int32_t &  num_feat,
int32_t &  num_vec 
)

Compute and return the transpose of the feature matrix, which will be preprocessed. num_feat and num_vec are returned by reference. The caller has to clean up.

possible with subset

Parameters:
num_feat number of features in matrix
num_vec number of vectors in matrix
Returns:
transposed feature matrix

Definition at line 501 of file SimpleFeatures.h.
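For a column-major matrix, transposition amounts to swapping the roles of the two indices. The standalone sketch below shows this; transpose_column_major is a hypothetical name, and the sketch returns a std::vector rather than a raw buffer that the caller must release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: transpose a column-major num_feat x num_vec matrix. The
// result is again column-major, with num_vec rows and num_feat columns.
std::vector<double> transpose_column_major(const std::vector<double>& m,
                                           int32_t num_feat, int32_t num_vec)
{
    assert(m.size() == static_cast<std::size_t>(num_feat) * num_vec);
    std::vector<double> t(m.size());
    for (int32_t v = 0; v < num_vec; ++v)
        for (int32_t f = 0; f < num_feat; ++f)
            t[static_cast<std::size_t>(f) * num_vec + v] =
                m[static_cast<std::size_t>(v) * num_feat + f];
    return t;
}
```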

CSimpleFeatures<ST>* get_transposed (  ) 

get a transposed copy of the features

possible with subset

Returns:
transposed copy

Definition at line 481 of file SimpleFeatures.h.

void initialize_cache (  ) 

Initialize cache

not possible with subset

Definition at line 714 of file SimpleFeatures.h.

virtual void load ( CFile *  loader  )  [virtual]

load features from file

Parameters:
loader File object via which to load data

Reimplemented from CFeatures.

void obtain_from_dot ( CDotFeatures *  df  ) 

obtain simple features from other dotfeatures

removes any subset before

Parameters:
df dotfeatures to obtain features from

Definition at line 585 of file SimpleFeatures.h.

virtual bool reshape ( int32_t  p_num_features,
int32_t  p_num_vectors 
) [virtual]

reshape

not possible with subset

Parameters:
p_num_features new number of features
p_num_vectors new number of vectors
Returns:
if reshaping was successful

Reimplemented from CFeatures.

Definition at line 748 of file SimpleFeatures.h.
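This page does not state when reshaping fails. A plausible reading, given the contiguous column-major storage, is that only the two dimensions are reinterpreted and the call succeeds only if the total number of entries is unchanged. The sketch below encodes that assumption; Shape and toy_reshape are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy dimension record, not the Shogun class.
struct Shape
{
    int32_t num_features;
    int32_t num_vectors;
};

// Sketch under the assumption that reshaping succeeds only when the
// element count is preserved; the data itself would not be touched.
bool toy_reshape(Shape& s, int32_t p_num_features, int32_t p_num_vectors)
{
    assert(s.num_features >= 0 && s.num_vectors >= 0);
    if (p_num_features < 0 || p_num_vectors < 0)
        return false;
    const int64_t old_count =
        static_cast<int64_t>(s.num_features) * s.num_vectors;
    const int64_t new_count =
        static_cast<int64_t>(p_num_features) * p_num_vectors;
    if (old_count != new_count)
        return false; // dimensions left unchanged
    s.num_features = p_num_features;
    s.num_vectors = p_num_vectors;
    return true;
}
```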

virtual void save ( CFile *  saver  )  [virtual]

save features to file

Parameters:
saver File object via which to save data

Reimplemented from CFeatures.

void set_feature_matrix ( SGMatrix< ST >  matrix  ) 

Setter for feature matrix

any subset is removed

Parameters:
matrix feature matrix to set

Definition at line 448 of file SimpleFeatures.h.

virtual void set_feature_matrix ( ST *  fm,
int32_t  num_feat,
int32_t  num_vec 
) [virtual]

Set feature matrix. It is necessary to set feature_matrix, num_features and num_vectors, where num_features is the column offset and columns are linear in memory. See below for the definition of feature_matrix.

not possible with subset

Parameters:
fm feature matrix to set
num_feat number of features in matrix
num_vec number of vectors in matrix

Definition at line 535 of file SimpleFeatures.h.

void set_feature_vector ( SGVector< ST >  vector,
int32_t  num 
)

set feature vector num

possible with subset

Parameters:
vector vector
num index of vector to set

Definition at line 234 of file SimpleFeatures.h.

void set_num_features ( int32_t  num  ) 

set number of features

Parameters:
num number to set

Definition at line 689 of file SimpleFeatures.h.

void set_num_vectors ( int32_t  num  ) 

set number of vectors

not possible with subset

Parameters:
num number to set

Definition at line 701 of file SimpleFeatures.h.

void vector_subset ( int32_t *  idx,
int32_t  idx_len 
)

Extracts the feature vectors mentioned in idx and replaces them in feature matrix in place.

It does not resize the allocated memory block.

not possible with subset

Parameters:
idx index with examples that shall remain in the feature matrix
idx_len length of the index

Note: assumes idx is sorted

Definition at line 321 of file SimpleFeatures.h.
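Since each feature vector is one contiguous column, keeping a subset of vectors in place reduces to moving whole columns towards the front. The sketch below is a standalone illustration with a hypothetical name (keep_vectors_in_place); it assumes idx is sorted in ascending order without duplicates and leaves the allocation untouched.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch only: keep the columns (vectors) listed in idx by copying each one
// to its new position. Assumes idx is sorted ascending without duplicates,
// so a column is never overwritten before it has been moved.
void keep_vectors_in_place(std::vector<double>& matrix, int32_t num_feat,
                           int32_t& num_vec, const int32_t* idx,
                           int32_t idx_len)
{
    assert(idx_len <= num_vec);
    for (int32_t i = 0; i < idx_len; ++i)
    {
        assert(idx[i] >= i && idx[i] < num_vec);
        if (idx[i] == i)
            continue; // already in place
        const double* src =
            matrix.data() + static_cast<std::size_t>(idx[i]) * num_feat;
        double* dst = matrix.data() + static_cast<std::size_t>(i) * num_feat;
        std::copy(src, src + num_feat, dst);
    }
    num_vec = idx_len; // logical size shrinks, buffer size does not
}
```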


Member Data Documentation

CCache<ST>* feature_cache [protected]

feature cache

Definition at line 1058 of file SimpleFeatures.h.

ST* feature_matrix [protected]

Feature matrix and its associated number of vectors and features. Note that num_vectors / num_features above have the same sizes if feature_matrix != NULL

Definition at line 1049 of file SimpleFeatures.h.

int32_t feature_matrix_num_features [protected]

number of features in feature matrix

Definition at line 1055 of file SimpleFeatures.h.

int32_t feature_matrix_num_vectors [protected]

number of vectors in feature matrix

Definition at line 1052 of file SimpleFeatures.h.

int32_t num_features [protected]

number of features in cache

Definition at line 1043 of file SimpleFeatures.h.

int32_t num_vectors [protected]

number of vectors in cache

Definition at line 1040 of file SimpleFeatures.h.


The documentation for this class was generated from the following file:

SHOGUN Machine Learning Toolbox - Documentation