
CStringFeatures< ST > Class Template Reference


Detailed Description

template<class ST>
class shogun::CStringFeatures< ST >

Template class StringFeatures implements a list of strings.

As this class is a template, the underlying storage type is quite arbitrary and not limited to character strings; it could also be sequences of floating point numbers etc. Strings differ from matrices (cf. CDenseFeatures) in that the dimensionality of the feature vectors (i.e. the strings) is not fixed; it may vary between strings.
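
To make the contrast with a dense matrix concrete, the following minimal sketch models a string list as a vector of vectors whose rows may differ in length. `StringList` and `max_length` are hypothetical stand-ins chosen for illustration; they are not Shogun types or functions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a list of strings over an arbitrary symbol type ST.
// Unlike a dense matrix, each row may have its own length.
template <class ST>
using StringList = std::vector<std::vector<ST>>;

// Length of the longest string in the list (0 for an empty list).
template <class ST>
std::size_t max_length(const StringList<ST>& strings)
{
    std::size_t longest = 0;
    for (const auto& s : strings)
        longest = std::max(longest, s.size());
    return longest;
}
```

The symbol type is a template parameter here as well, so the same helper works for `char` strings and for sequences of floating point numbers.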

Most string kernels require StringFeatures, but a number of them additionally require all strings to have the same length.

When preprocessors are attached to string features they may shorten the string, but are not allowed to return strings longer than max_string_length, as some algorithms depend on this.

Also note that string features cannot currently be computed on-the-fly.

Subset access is (partly) supported for this feature type. Simply use the (inherited) add_subset() and remove_subset() functions. Once a subset is set, all calls that work with features are translated to the subset. See the comments of each method to find out whether subsets are supported for it. See also the CFeatures class documentation.

Definition at line 72 of file StringFeatures.h.

Inheritance diagram for CStringFeatures< ST >: (figure not reproduced)


Public Member Functions

 CStringFeatures ()
 CStringFeatures (EAlphabet alpha)
 CStringFeatures (SGStringList< ST > string_list, EAlphabet alpha)
 CStringFeatures (SGStringList< ST > string_list, CAlphabet *alpha)
 CStringFeatures (CAlphabet *alpha)
 CStringFeatures (const CStringFeatures &orig)
 CStringFeatures (CFile *loader, EAlphabet alpha=DNA)
virtual ~CStringFeatures ()
virtual void cleanup ()
virtual void cleanup_feature_vector (int32_t num)
virtual void cleanup_feature_vectors (int32_t start, int32_t stop)
virtual EFeatureClass get_feature_class () const
virtual EFeatureType get_feature_type () const
CAlphabet * get_alphabet ()
virtual CFeatures * duplicate () const
SGVector< ST > get_feature_vector (int32_t num)
void set_feature_vector (SGVector< ST > vector, int32_t num)
void enable_on_the_fly_preprocessing ()
void disable_on_the_fly_preprocessing ()
ST * get_feature_vector (int32_t num, int32_t &len, bool &dofree)
CStringFeatures< ST > * get_transposed ()
SGString< ST > * get_transposed (int32_t &num_feat, int32_t &num_vec)
void free_feature_vector (ST *feat_vec, int32_t num, bool dofree)
void free_feature_vector (SGVector< ST > feat_vec, int32_t num)
virtual ST get_feature (int32_t vec_num, int32_t feat_num)
virtual int32_t get_vector_length (int32_t vec_num)
virtual int32_t get_max_vector_length ()
virtual int32_t get_num_vectors () const
floatmax_t get_num_symbols ()
floatmax_t get_max_num_symbols ()
floatmax_t get_original_num_symbols ()
int32_t get_order ()
ST get_masked_symbols (ST symbol, uint8_t mask)
ST shift_offset (ST offset, int32_t amount)
ST shift_symbol (ST symbol, int32_t amount)
virtual void load (CFile *loader)
void load_ascii_file (char *fname, bool remap_to_bin=true, EAlphabet ascii_alphabet=DNA, EAlphabet binary_alphabet=RAWDNA)
bool load_fasta_file (const char *fname, bool ignore_invalid=false)
bool load_fastq_file (const char *fname, bool ignore_invalid=false, bool bitremap_in_single_string=false)
bool load_from_directory (char *dirname)
void set_features (SGStringList< ST > feats)
bool set_features (SGString< ST > *p_features, int32_t p_num_vectors, int32_t p_max_string_length)
bool append_features (CStringFeatures< ST > *sf)
bool append_features (SGString< ST > *p_features, int32_t p_num_vectors, int32_t p_max_string_length)
SGStringList< ST > get_features ()
virtual SGString< ST > * get_features (int32_t &num_str, int32_t &max_str_len)
virtual SGString< ST > * copy_features (int32_t &num_str, int32_t &max_str_len)
virtual void get_features (SGString< ST > **dst, int32_t *num_str)
virtual void save (CFile *writer)
virtual bool load_compressed (char *src, bool decompress)
virtual bool save_compressed (char *dest, E_COMPRESSION_TYPE compression, int level)
virtual int32_t get_size () const
virtual bool apply_preprocessor (bool force_preprocessing=false)
int32_t obtain_by_sliding_window (int32_t window_size, int32_t step_size, int32_t skip=0)
int32_t obtain_by_position_list (int32_t window_size, CDynamicArray< int32_t > *positions, int32_t skip=0)
bool obtain_from_char (CStringFeatures< char > *sf, int32_t start, int32_t p_order, int32_t gap, bool rev)
template<class CT >
bool obtain_from_char_features (CStringFeatures< CT > *sf, int32_t start, int32_t p_order, int32_t gap, bool rev)
bool have_same_length (int32_t len=-1)
void embed_features (int32_t p_order)
void compute_symbol_mask_table (int64_t max_val)
void unembed_word (ST word, uint8_t *seq, int32_t len)
ST embed_word (ST *seq, int32_t len)
void determine_maximum_string_length ()
virtual void set_feature_vector (int32_t num, ST *string, int32_t len)
virtual void get_histogram (float64_t **hist, int32_t *rows, int32_t *cols, bool normalize=true)
virtual void create_random (float64_t *hist, int32_t rows, int32_t cols, int32_t num_vec)
virtual CFeatures * copy_subset (SGVector< index_t > indices)
virtual const char * get_name () const
virtual void subset_changed_post ()
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
EFeatureType get_feature_type () const
template<>
bool get_masked_symbols (bool symbol, uint8_t mask)
template<>
float32_t get_masked_symbols (float32_t symbol, uint8_t mask)
template<>
float64_t get_masked_symbols (float64_t symbol, uint8_t mask)
template<>
floatmax_t get_masked_symbols (floatmax_t symbol, uint8_t mask)
template<>
bool shift_offset (bool symbol, int32_t amount)
template<>
float32_t shift_offset (float32_t symbol, int32_t amount)
template<>
float64_t shift_offset (float64_t symbol, int32_t amount)
template<>
floatmax_t shift_offset (floatmax_t symbol, int32_t amount)
template<>
bool shift_symbol (bool symbol, int32_t amount)
template<>
float32_t shift_symbol (float32_t symbol, int32_t amount)
template<>
float64_t shift_symbol (float64_t symbol, int32_t amount)
template<>
floatmax_t shift_symbol (floatmax_t symbol, int32_t amount)
template<>
bool obtain_from_char_features (CStringFeatures< CT > *sf, int32_t start, int32_t p_order, int32_t gap, bool rev)
template<>
bool obtain_from_char_features (CStringFeatures< CT > *sf, int32_t start, int32_t p_order, int32_t gap, bool rev)
template<>
bool obtain_from_char_features (CStringFeatures< CT > *sf, int32_t start, int32_t p_order, int32_t gap, bool rev)
template<>
void embed_features (int32_t p_order)
template<>
void embed_features (int32_t p_order)
template<>
void embed_features (int32_t p_order)
template<>
void compute_symbol_mask_table (int64_t max_val)
template<>
void compute_symbol_mask_table (int64_t max_val)
template<>
void compute_symbol_mask_table (int64_t max_val)
template<>
float32_t embed_word (float32_t *seq, int32_t len)
template<>
float64_t embed_word (float64_t *seq, int32_t len)
template<>
floatmax_t embed_word (floatmax_t *seq, int32_t len)
template<>
void unembed_word (float32_t word, uint8_t *seq, int32_t len)
template<>
void unembed_word (float64_t word, uint8_t *seq, int32_t len)
template<>
void unembed_word (floatmax_t word, uint8_t *seq, int32_t len)
virtual int32_t add_preprocessor (CPreprocessor *p)
 set preprocessor
virtual CPreprocessor * del_preprocessor (int32_t num)
 del current preprocessor
CPreprocessor * get_preprocessor (int32_t num) const
 get current preprocessor
void set_preprocessed (int32_t num)
bool is_preprocessed (int32_t num) const
int32_t get_num_preprocessed () const
 get whether specified preprocessor (or all if num=1) was/were already applied
int32_t get_num_preprocessors () const
void clean_preprocessors ()
int32_t get_cache_size () const
virtual bool reshape (int32_t num_features, int32_t num_vectors)
void list_feature_obj () const
bool check_feature_compatibility (CFeatures *f) const
bool has_property (EFeatureProperty p) const
void set_property (EFeatureProperty p)
void unset_property (EFeatureProperty p)
virtual CFeatures * create_merged_copy (CFeatures *other)
virtual void add_subset (SGVector< index_t > subset)
virtual void remove_subset ()
virtual void remove_all_subsets ()
virtual CSubsetStack * get_subset_stack ()
virtual CSGObject * shallow_copy () const
virtual CSGObject * deep_copy () const
virtual bool is_generic (EPrimitiveType *generic) const
template<class T >
void set_generic ()
void unset_generic ()
virtual void print_serializable (const char *prefix="")
virtual bool save_serializable (CSerializableFile *file, const char *prefix="", int32_t param_version=VERSION_PARAMETER)
virtual bool load_serializable (CSerializableFile *file, const char *prefix="", int32_t param_version=VERSION_PARAMETER)
DynArray< TParameter * > * load_file_parameters (const SGParamInfo *param_info, int32_t file_version, CSerializableFile *file, const char *prefix="")
DynArray< TParameter * > * load_all_file_parameters (int32_t file_version, int32_t current_version, CSerializableFile *file, const char *prefix="")
void map_parameters (DynArray< TParameter * > *param_base, int32_t &base_version, DynArray< const SGParamInfo * > *target_param_infos)
void set_global_io (SGIO *io)
SGIO * get_global_io ()
void set_global_parallel (Parallel *parallel)
Parallel * get_global_parallel ()
void set_global_version (Version *version)
Version * get_global_version ()
SGStringList< char > get_modelsel_names ()
void print_modsel_params ()
char * get_modsel_param_descr (const char *param_name)
index_t get_modsel_param_index (const char *param_name)
void build_parameter_dictionary (CMap< TParameter *, CSGObject * > &dict)

Static Public Member Functions

static ST * get_zero_terminated_string_copy (SGString< ST > str)

Public Attributes

SGIO * io
Parallel * parallel
Version * version
Parameter * m_parameters
Parameter * m_model_selection_parameters
ParameterMap * m_parameter_map
uint32_t m_hash

Protected Member Functions

virtual ST * compute_feature_vector (int32_t num, int32_t &len)
virtual TParameter * migrate (DynArray< TParameter * > *param_base, const SGParamInfo *target)
virtual void one_to_one_migration_prepare (DynArray< TParameter * > *param_base, const SGParamInfo *target, TParameter *&replacement, TParameter *&to_migrate, char *old_name=NULL)
virtual void load_serializable_pre () throw (ShogunException)
virtual void load_serializable_post () throw (ShogunException)
virtual void save_serializable_pre () throw (ShogunException)
virtual void save_serializable_post () throw (ShogunException)
virtual bool update_parameter_hash ()

Protected Attributes

CAlphabet * alphabet
int32_t num_vectors
SGString< ST > * features
ST * single_string
int32_t length_of_single_string
 length of prior single string
int32_t max_string_length
floatmax_t num_symbols
 number of used symbols
floatmax_t original_num_symbols
 original number of used symbols (before higher order mapping)
int32_t order
 order used in higher order mapping
ST * symbol_mask_table
 order used in higher order mapping
int32_t symbol_mask_table_len
 order used in higher order mapping
bool preprocess_on_get
 preprocess on-the-fly?
CCache< ST > * feature_cache
CSubsetStack * m_subset_stack

Constructor & Destructor Documentation

CStringFeatures (  ) 

default constructor

Definition at line 20 of file StringFeatures.cpp.

CStringFeatures ( EAlphabet  alpha  ) 

constructor

Parameters:
alpha alphabet (type) to use for string features

Definition at line 26 of file StringFeatures.cpp.

CStringFeatures ( SGStringList< ST >  string_list,
EAlphabet  alpha 
)

constructor

Parameters:
string_list 
alpha alphabet (type) to use for string features

Definition at line 36 of file StringFeatures.cpp.

CStringFeatures ( SGStringList< ST >  string_list,
CAlphabet *  alpha 
)

constructor

Parameters:
string_list 
alpha an actual alphabet

Definition at line 48 of file StringFeatures.cpp.

CStringFeatures ( CAlphabet *  alpha  ) 

constructor

Parameters:
alpha alphabet to use for string features

Definition at line 60 of file StringFeatures.cpp.

CStringFeatures ( const CStringFeatures< ST > &  orig  ) 

copy constructor

Definition at line 72 of file StringFeatures.cpp.

CStringFeatures ( CFile *  loader,
EAlphabet  alpha = DNA 
)

constructor

Parameters:
loader File object via which to load data
alpha alphabet (type) to use for string features

Definition at line 112 of file StringFeatures.cpp.

~CStringFeatures (  )  [virtual]

Definition at line 127 of file StringFeatures.cpp.


Member Function Documentation

int32_t add_preprocessor ( CPreprocessor *  p  )  [virtual, inherited]

set preprocessor

add preprocessor

Parameters:
p preprocessor to set
Returns:
an integer value (its meaning is not documented)

Definition at line 81 of file Features.cpp.

void add_subset ( SGVector< index_t >  subset  )  [virtual, inherited]

adds a subset of indices on top of the current subsets (possibly a subset of a subset). Calls subset_changed_post() afterwards.

Parameters:
subset subset of indices to add

Reimplemented in CCombinedFeatures.

Definition at line 351 of file Features.cpp.
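
The "subset of a subset" behaviour can be pictured as a stack of index vectors, where each new subset indexes into the view defined by the subsets below it. The sketch below is a simplified illustration under that assumption; `SubsetStackSketch` and `to_real_index` are hypothetical names and do not reproduce the actual CSubsetStack implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a stack of index subsets (not Shogun's CSubsetStack).
class SubsetStackSketch
{
public:
    // Push a subset; its entries index into the currently active view.
    void add_subset(const std::vector<int>& subset) { m_stack.push_back(subset); }

    // Pop the most recently added subset, if there is one.
    void remove_subset()
    {
        if (!m_stack.empty())
            m_stack.pop_back();
    }

    // Translate an index in the active view into an index into the full data.
    // With no subset set, the index is returned unchanged.
    int to_real_index(int idx) const
    {
        for (std::size_t level = m_stack.size(); level-- > 0; )
            idx = m_stack[level].at(static_cast<std::size_t>(idx));
        return idx;
    }

private:
    std::vector<std::vector<int>> m_stack;
};
```

For instance, after adding the subset {2, 4, 6} and then {0, 2} on top of it, index 1 of the active view refers to element 6 of the full data; removing the top subset makes index 1 refer to element 4 again.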

bool append_features ( SGString< ST > *  p_features,
int32_t  p_num_vectors,
int32_t  p_max_string_length 
)

append features

not possible with subset

Parameters:
p_features features to append
p_num_vectors number of vectors
p_max_string_length maximum string length

note that p_features will be SG_FREE()'d on success

Returns:
if setting was successful

Definition at line 911 of file StringFeatures.cpp.

bool append_features ( CStringFeatures< ST > *  sf  ) 

append features. If the given string features have a subset, only the subset will be copied.

not possible with subset

Parameters:
sf features to append
Returns:
if setting was successful

Definition at line 889 of file StringFeatures.cpp.

bool apply_preprocessor ( bool  force_preprocessing = false  )  [virtual]

apply preprocessor

Parameters:
force_preprocessing if preprocessing shall be forced
Returns:
if applying was successful

Definition at line 1169 of file StringFeatures.cpp.

void build_parameter_dictionary ( CMap< TParameter *, CSGObject * > &  dict  )  [inherited]

Builds a dictionary of all parameters in SGObject as well as those of SGObjects that are parameters of this object. The dictionary maps parameters to the objects that own them.

Parameters:
dict dictionary of parameters to be built.

Definition at line 1201 of file SGObject.cpp.

bool check_feature_compatibility ( CFeatures *  f  )  const [inherited]

check feature compatibility

Parameters:
f features to check for compatibility
Returns:
if features are compatible

Definition at line 326 of file Features.cpp.

void clean_preprocessors (  )  [inherited]

clears all preprocs

Definition at line 137 of file Features.cpp.

void cleanup (  )  [virtual]

cleanup string features.

removes any subset before

Reimplemented in CStringFileFeatures< ST >.

Definition at line 134 of file StringFeatures.cpp.

void cleanup_feature_vector ( int32_t  num  )  [virtual]

cleanup a single feature vector

possible with subset

Parameters:
num number of the vector

Reimplemented in CStringFileFeatures< ST >.

Definition at line 172 of file StringFeatures.cpp.

void cleanup_feature_vectors ( int32_t  start,
int32_t  stop 
) [virtual]

cleanup multiple feature vectors

possible with subset

Parameters:
start index of first vector to be cleaned
stop index of the last vector to be cleaned

Definition at line 187 of file StringFeatures.cpp.

ST * compute_feature_vector ( int32_t  num,
int32_t &  len 
) [protected, virtual]

compute feature vector for sample num. If target is set, the vector is written to target. len is returned by reference.

possible with subset

Parameters:
num which vector
len length of vector
Returns:
feature vector

Definition at line 1642 of file StringFeatures.cpp.

void compute_symbol_mask_table ( int64_t  max_val  ) 

Definition at line 1882 of file StringFeatures.cpp.

void compute_symbol_mask_table ( int64_t  max_val  ) 

Definition at line 1885 of file StringFeatures.cpp.

void compute_symbol_mask_table ( int64_t  max_val  ) 

compute symbol mask table

required to access bit-based symbols

not implemented for subset

Definition at line 1366 of file StringFeatures.cpp.

void compute_symbol_mask_table ( int64_t  max_val  ) 

Definition at line 1888 of file StringFeatures.cpp.

SGString< ST > * copy_features ( int32_t &  num_str,
int32_t &  max_str_len 
) [virtual]

copy_features

possible with subset

Parameters:
num_str number of strings (returned)
max_str_len maximal string length (returned)
Returns:
string features

Definition at line 982 of file StringFeatures.cpp.

CFeatures * copy_subset ( SGVector< index_t >  indices  )  [virtual]

Creates a new CFeatures instance containing copies of the elements which are specified by the provided indices.

possible with subset

Parameters:
indices indices of feature elements to copy
Returns:
new CFeatures instance with copies of feature data

Reimplemented from CFeatures.

Definition at line 1601 of file StringFeatures.cpp.

virtual CFeatures* create_merged_copy ( CFeatures *  other  )  [virtual, inherited]

Takes another feature instance and returns a new instance which is a concatenation of a copy of this instance's data and the given instance's data. Note that the feature types have to be equal.

NOT IMPLEMENTED!

Parameters:
other feature object to append
Returns:
new feature object which contains copy of data of this instance and of given one

Reimplemented in CCombinedFeatures, CDenseFeatures< ST >, CDenseFeatures< uint32_t >, CDenseFeatures< float64_t >, CDenseFeatures< T >, and CDenseFeatures< uint16_t >.

Definition at line 234 of file Features.h.

void create_random ( float64_t *  hist,
int32_t  rows,
int32_t  cols,
int32_t  num_vec 
) [virtual]

create some random strings based on normalized histogram

not possible with subset

Definition at line 1499 of file StringFeatures.cpp.

virtual CSGObject* deep_copy (  )  const [virtual, inherited]

A deep copy. All the instance variables will also be copied.

Definition at line 131 of file SGObject.h.

CPreprocessor * del_preprocessor ( int32_t  num  )  [virtual, inherited]

del current preprocessor

delete preprocessor from list; the caller has to clean up the returned preprocessor

Parameters:
num index of preprocessor in list

Definition at line 143 of file Features.cpp.

void determine_maximum_string_length (  ) 

determine new maximum string length

possible with subset

Definition at line 1422 of file StringFeatures.cpp.

void disable_on_the_fly_preprocessing (  ) 

call this to disable on the fly feature preprocessing on get_feature_vector. Useful when you manually apply preprocessors.

Definition at line 267 of file StringFeatures.cpp.

CFeatures * duplicate (  )  const [virtual]

duplicate feature object

Returns:
feature object

Implements CFeatures.

Definition at line 215 of file StringFeatures.cpp.

void embed_features ( int32_t  p_order  ) 

Definition at line 1872 of file StringFeatures.cpp.

void embed_features ( int32_t  p_order  ) 

Definition at line 1878 of file StringFeatures.cpp.

void embed_features ( int32_t  p_order  ) 

Definition at line 1875 of file StringFeatures.cpp.

void embed_features ( int32_t  p_order  ) 

embed string features in bit representation in-place

not implemented for subset

Definition at line 1312 of file StringFeatures.cpp.

float32_t embed_word ( float32_t *  seq,
int32_t  len 
)

Definition at line 1892 of file StringFeatures.cpp.

float64_t embed_word ( float64_t *  seq,
int32_t  len 
)

Definition at line 1896 of file StringFeatures.cpp.

floatmax_t embed_word ( floatmax_t *  seq,
int32_t  len 
)

Definition at line 1900 of file StringFeatures.cpp.

ST embed_word ( ST *  seq,
int32_t  len 
)

embed a single word

Parameters:
seq sequence of size len in a bitfield
len 

Definition at line 1409 of file StringFeatures.cpp.
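
As a rough illustration of what embedding a word into a bitfield can look like, the sketch below packs `len` symbols into one integer, giving each symbol a fixed number of bits. The function name `embed_word_sketch`, the `bits_per_symbol` parameter, and the packing order (first symbol in the most significant slot) are assumptions made for this example and are not taken from the documentation above.

```cpp
#include <cassert>
#include <cstdint>

// Pack len symbols into a single word, first symbol in the most significant slot.
// bits_per_symbol is an assumed parameter standing in for the alphabet's bit width.
inline std::uint64_t embed_word_sketch(const std::uint8_t* seq, int len,
                                       unsigned bits_per_symbol)
{
    // Keep shifts well defined: 1..8 bits per symbol, at most 64 bits in total.
    assert(bits_per_symbol >= 1 && bits_per_symbol <= 8);
    assert(len >= 0 && static_cast<unsigned>(len) * bits_per_symbol <= 64);

    std::uint64_t value = 0;
    for (int i = 0; i < len; ++i)
    {
        value <<= bits_per_symbol;  // make room for the next symbol
        value |= seq[i];            // append it in the lowest slot
    }
    return value;
}
```

With a 2-bit encoding in which the four symbols are numbered 0 to 3, the sequence {0, 1, 2, 3} is packed into the binary value 00011011, i.e. 27.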

void enable_on_the_fly_preprocessing (  ) 

call this to preprocess string features upon get_feature_vector

Definition at line 262 of file StringFeatures.cpp.

void free_feature_vector ( SGVector< ST >  feat_vec,
int32_t  num 
)

free feature vector

possible with subset

Parameters:
feat_vec feature vector to free
num index in feature cache, possibly from subset

Definition at line 372 of file StringFeatures.cpp.

void free_feature_vector ( ST *  feat_vec,
int32_t  num,
bool  dofree 
)

free feature vector

possible with subset

Parameters:
feat_vec feature vector to free
num index in feature cache, possibly from subset
dofree if vector should be really deleted

Definition at line 354 of file StringFeatures.cpp.

CAlphabet * get_alphabet (  ) 

get alphabet used in string features

Returns:
alphabet

Definition at line 209 of file StringFeatures.cpp.

int32_t get_cache_size (  )  const [inherited]

get cache size

Returns:
cache size

Definition at line 203 of file Features.cpp.

ST get_feature ( int32_t  vec_num,
int32_t  feat_num 
) [virtual]

get feature

possible with subset

Parameters:
vec_num which vector
feat_num which feature, possibly from subset
Returns:
feature

Definition at line 387 of file StringFeatures.cpp.

EFeatureClass get_feature_class (  )  const [virtual]

get feature class

Returns:
feature class STRING

Implements CFeatures.

Definition at line 205 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the char feature can deal with

Returns:
feature type char

Implements CFeatures.

Definition at line 1702 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the char feature can deal with

Returns:
feature type char

Implements CFeatures.

Definition at line 1711 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the BYTE feature can deal with

Returns:
feature type BYTE

Implements CFeatures.

Definition at line 1720 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the SHORT feature can deal with

Returns:
feature type SHORT

Implements CFeatures.

Definition at line 1729 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the WORD feature can deal with

Returns:
feature type WORD

Implements CFeatures.

Definition at line 1738 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the INT feature can deal with

Returns:
feature type INT

Implements CFeatures.

Definition at line 1747 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the INT feature can deal with

Returns:
feature type INT

Implements CFeatures.

Definition at line 1756 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the LONG feature can deal with

Returns:
feature type LONG

Implements CFeatures.

Definition at line 1765 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the ULONG feature can deal with

Returns:
feature type ULONG

Implements CFeatures.

Definition at line 1774 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the SHORTREAL feature can deal with

Returns:
feature type SHORTREAL

Implements CFeatures.

Definition at line 1783 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the DREAL feature can deal with

Returns:
feature type DREAL

Implements CFeatures.

Definition at line 1792 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type the LONGREAL feature can deal with

Returns:
feature type LONGREAL

Implements CFeatures.

Definition at line 1801 of file StringFeatures.cpp.

EFeatureType get_feature_type (  )  const [virtual]

get feature type

Returns:
templated feature type

Implements CFeatures.

Definition at line 207 of file StringFeatures.cpp.

SGVector< ST > get_feature_vector ( int32_t  num  ) 

get string for selected example num

possible with subset

Parameters:
num index of the string

Definition at line 220 of file StringFeatures.cpp.

ST * get_feature_vector ( int32_t  num,
int32_t &  len,
bool &  dofree 
)

get feature vector for sample num

possible with subset

Parameters:
num index of feature vector
len length is returned by reference
dofree whether returned vector must be freed by caller via free_feature_vector
Returns:
feature vector for sample num

Definition at line 272 of file StringFeatures.cpp.
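
The `len` and `dofree` out-parameters imply a get/free pairing on the caller's side. The sketch below mimics that convention with a small stand-in class; `TinyStringStore`, its `copy_on_get` switch, and `total_length` are hypothetical and only illustrate the calling pattern, not the actual caching or preprocessing logic.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in (not a Shogun class) showing the len/dofree convention.
class TinyStringStore
{
public:
    explicit TinyStringStore(std::vector<std::string> data, bool copy_on_get = false)
        : m_data(std::move(data)), m_copy_on_get(copy_on_get) {}

    // Returns string num; len and dofree are filled in by reference.
    char* get_feature_vector(int num, int& len, bool& dofree)
    {
        std::string& s = m_data.at(static_cast<std::size_t>(num));
        len = static_cast<int>(s.size());
        dofree = m_copy_on_get;
        if (!dofree)
            return &s[0];               // internal storage, owned by the store
        char* copy = new char[s.size()];
        std::copy(s.begin(), s.end(), copy);
        return copy;                    // fresh copy, must be released again
    }

    // Releases the vector only if dofree marks it as a copy.
    void free_feature_vector(char* vec, int /*num*/, bool dofree)
    {
        if (dofree)
            delete[] vec;
    }

private:
    std::vector<std::string> m_data;
    bool m_copy_on_get;
};

// Sums all string lengths, pairing every get with a free.
inline int total_length(TinyStringStore& store, int num_vectors)
{
    int total = 0;
    for (int i = 0; i < num_vectors; ++i)
    {
        int len = 0;
        bool dofree = false;
        char* vec = store.get_feature_vector(i, len, dofree);
        total += len;
        store.free_feature_vector(vec, i, dofree);
    }
    return total;
}
```

Passing `dofree` back unchanged keeps the caller code identical whether the returned pointer refers to internal storage or to a temporary copy.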

SGStringList< ST > get_features (  ) 

get_features

Returns:
features

Definition at line 964 of file StringFeatures.cpp.

SGString< ST > * get_features ( int32_t &  num_str,
int32_t &  max_str_len 
) [virtual]

get_features

not possible with subset

Parameters:
num_str number of strings (returned)
max_str_len maximal string length (returned)
Returns:
string features

Definition at line 972 of file StringFeatures.cpp.

void get_features ( SGString< ST > **  dst,
int32_t *  num_str 
) [virtual]

get_features (swig compatible)

possible with subset

Parameters:
dst string features (returned)
num_str number of strings (returned)

Definition at line 1004 of file StringFeatures.cpp.

SGIO * get_global_io (  )  [inherited]

get the io object

Returns:
io object

Definition at line 224 of file SGObject.cpp.

Parallel * get_global_parallel (  )  [inherited]

get the parallel object

Returns:
parallel object

Definition at line 259 of file SGObject.cpp.

Version * get_global_version (  )  [inherited]

get the version object

Returns:
version object

Definition at line 272 of file SGObject.cpp.

void get_histogram ( float64_t **  hist,
int32_t *  rows,
int32_t *  cols,
bool  normalize = true 
) [virtual]

compute histogram over strings

possible with subset

Definition at line 1457 of file StringFeatures.cpp.

bool get_masked_symbols ( bool  symbol,
uint8_t  mask 
)

Definition at line 1806 of file StringFeatures.cpp.

float32_t get_masked_symbols ( float32_t  symbol,
uint8_t  mask 
)

Definition at line 1810 of file StringFeatures.cpp.

float64_t get_masked_symbols ( float64_t  symbol,
uint8_t  mask 
)

Definition at line 1814 of file StringFeatures.cpp.

floatmax_t get_masked_symbols ( floatmax_t  symbol,
uint8_t  mask 
)

Definition at line 1818 of file StringFeatures.cpp.

ST get_masked_symbols ( ST  symbol,
uint8_t  mask 
)

a higher order mapped symbol will be shaped such that the symbols specified by bits in the mask will be returned.

Parameters:
symbol symbol to mask
mask mask to apply
Returns:
masked symbol

Definition at line 430 of file StringFeatures.cpp.
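
One way to read the masking described above is that each bit of `mask` selects one symbol slot inside a higher order word. The sketch below works under that assumption, with slot 0 taken to be the least significant one; the name `masked_symbols_sketch` and the `bits_per_symbol` parameter are hypothetical and the real symbol mask table may be organised differently.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the symbol slots of word whose bit is set in mask.
// Assumption: slot j occupies bits [j*bits_per_symbol, (j+1)*bits_per_symbol).
inline std::uint64_t masked_symbols_sketch(std::uint64_t word, std::uint8_t mask,
                                           unsigned bits_per_symbol)
{
    // 8 mask bits times at most 8 bits per symbol fit into 64 bits.
    assert(bits_per_symbol >= 1 && bits_per_symbol <= 8);

    const std::uint64_t slot = (std::uint64_t(1) << bits_per_symbol) - 1;
    std::uint64_t keep = 0;
    for (unsigned j = 0; j < 8; ++j)
        if (mask & (1u << j))
            keep |= slot << (bits_per_symbol * j);
    return word & keep;
}
```

For the 2-bit word 00011011 (27) and the mask 00000101, slots 0 and 2 are kept and the result is 00010011 (19).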

floatmax_t get_max_num_symbols (  ) 

get maximum number of symbols

Note: floatmax_t sounds weird, but int64_t is not long enough (and there is no int128_t type)

Returns:
maximum number of symbols

Definition at line 424 of file StringFeatures.cpp.
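
The note about int64_t can be checked with a small calculation: symbol counts that grow as a power of the alphabet size quickly exceed 64 bits. The sketch below assumes that such counts have the form alphabet_size^order and uses `long double` as a stand-in for floatmax_t; both helper names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Number of distinct words of the given order over an alphabet of the given size,
// computed in long double (used here as a stand-in for floatmax_t).
inline long double num_higher_order_symbols(int alphabet_size, int order)
{
    long double n = 1.0L;
    for (int i = 0; i < order; ++i)
        n *= alphabet_size;
    return n;
}

// True if the count no longer fits into a signed 64-bit integer.
inline bool exceeds_int64(long double n)
{
    return n > static_cast<long double>(std::numeric_limits<std::int64_t>::max());
}
```

A 4-letter alphabet at order 8 gives 65536 symbols and fits easily, whereas a 256-letter alphabet at order 8 gives 2^64 symbols, which is beyond the int64_t range.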

int32_t get_max_vector_length (  )  [virtual]

get maximum vector length

this one is updated when a subset is set

Returns:
maximum vector/string length

Definition at line 412 of file StringFeatures.cpp.

SGStringList< char > get_modelsel_names (  )  [inherited]
Returns:
vector of names of all parameters which are registered for model selection

Definition at line 1108 of file SGObject.cpp.

char * get_modsel_param_descr ( const char *  param_name  )  [inherited]

Returns description of a given parameter string, if it exists. SG_ERROR otherwise

Parameters:
param_name name of the parameter
Returns:
description of the parameter

Definition at line 1132 of file SGObject.cpp.

index_t get_modsel_param_index ( const char *  param_name  )  [inherited]

Returns index of model selection parameter with provided name

Parameters:
param_name name of model selection parameter
Returns:
index of model selection parameter with provided name, -1 if there is no such

Definition at line 1145 of file SGObject.cpp.

virtual const char* get_name (  )  const [virtual]
Returns:
object name

Implements CSGObject.

Definition at line 649 of file StringFeatures.h.

int32_t get_num_preprocessed (  )  const [inherited]

get whether specified preprocessor (or all if num=1) was/were already applied

get the number of applied preprocs

Returns:
number of applied preprocessors

Definition at line 123 of file Features.cpp.

int32_t get_num_preprocessors (  )  const [inherited]

get number of preprocessors

Returns:
number of preprocessors

Definition at line 198 of file Features.cpp.

floatmax_t get_num_symbols (  ) 

get number of symbols

Note: floatmax_t sounds weird, but LONG is not long enough

Returns:
number of symbols

Definition at line 422 of file StringFeatures.cpp.

int32_t get_num_vectors (  )  const [virtual]
Returns:
number of vectors, possibly of subset

Implements CFeatures.

Definition at line 417 of file StringFeatures.cpp.

int32_t get_order (  ) 

order used for higher order mapping

Returns:
order

Definition at line 428 of file StringFeatures.cpp.

floatmax_t get_original_num_symbols (  ) 

number of symbols before higher order mapping

Returns:
original number of symbols

Definition at line 426 of file StringFeatures.cpp.

CPreprocessor * get_preprocessor ( int32_t  num  )  const [inherited]

get current preprocessor

get specified preprocessor

Parameters:
num index of preprocessor in list

Definition at line 111 of file Features.cpp.

int32_t get_size (  )  const [virtual]

get memory footprint of one feature

Returns:
memory footprint of one feature

Implements CFeatures.

Definition at line 1167 of file StringFeatures.cpp.

CSubsetStack * get_subset_stack (  )  [virtual, inherited]

returns subset stack

Returns:
subset stack

Definition at line 369 of file Features.cpp.

CStringFeatures< ST > * get_transposed (  ) 

get a transposed copy of the features

possible with subset

Returns:
transposed copy

Definition at line 310 of file StringFeatures.cpp.

SGString< ST > * get_transposed ( int32_t &  num_feat,
int32_t &  num_vec 
)

compute and return the transpose of the string features matrix, which will be preprocessed. num_feat and num_vec are returned by reference; the caller has to clean up.

note that strings all have to have same length

possible with subset

Parameters:
num_feat number of features in matrix
num_vec number of vectors in matrix
Returns:
transposed string features

Definition at line 323 of file StringFeatures.cpp.
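
The requirement that all strings have the same length follows from treating the string list as a matrix. A minimal sketch of such a transpose on plain vectors is given below; `transpose_strings` is a hypothetical helper and ignores preprocessing, subsets, and memory ownership.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a list of equal-length strings: entry j of string i
// becomes entry i of string j in the result.
template <class ST>
std::vector<std::vector<ST>> transpose_strings(
    const std::vector<std::vector<ST>>& strings)
{
    if (strings.empty())
        return {};

    const std::size_t num_vec = strings.size();
    const std::size_t num_feat = strings[0].size();
    for (const auto& s : strings)
        assert(s.size() == num_feat);  // all strings must have the same length

    std::vector<std::vector<ST>> result(num_feat, std::vector<ST>(num_vec));
    for (std::size_t i = 0; i < num_vec; ++i)
        for (std::size_t j = 0; j < num_feat; ++j)
            result[j][i] = strings[i][j];
    return result;
}
```

Two strings of length three therefore become three strings of length two.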

int32_t get_vector_length ( int32_t  vec_num  )  [virtual]

get vector length

possible with subset

Parameters:
vec_num which vector, possibly from subset
Returns:
length of vector

Definition at line 401 of file StringFeatures.cpp.

ST * get_zero_terminated_string_copy ( SGString< ST >  str  )  [static]

get a zero terminated copy of the string

Parameters:
str the string to copy
Returns:
zero terminated copy of str

note that this function is only sensible for character strings

Definition at line 1434 of file StringFeatures.cpp.

bool has_property ( EFeatureProperty  p  )  const [inherited]

check if features have given property

Parameters:
p feature property
Returns:
if features have given property

Definition at line 336 of file Features.cpp.

bool have_same_length ( int32_t  len = -1  ) 

check if the length of each vector in this feature object equals the given length. If a subset exists, only the subset is checked.

possible for subset

Parameters:
len vector length to check against
Returns:
if length of each vector in this feature object equals the given length.

Definition at line 1293 of file StringFeatures.cpp.
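The check can be sketched in standalone C++ as follows. all_same_length is a hypothetical helper, and treating len = -1 as "compare against the first string" is an assumption; the text above does not say which reference length Shogun uses for the default argument.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Shogun). With len == -1 the
// length of the first string is used as the reference; that reading
// of the default argument is an assumption of this sketch.
bool all_same_length(const std::vector<std::string>& strings, int len = -1)
{
    if (strings.empty())
        return true;
    const std::size_t expected =
        (len < 0) ? strings[0].size() : static_cast<std::size_t>(len);
    for (const std::string& s : strings)
        if (s.size() != expected)
            return false;
    return true;
}
```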

bool is_generic ( EPrimitiveType *  generic  )  const [virtual, inherited]

If the SGSerializable is a class template then TRUE will be returned and GENERIC is set to the type of the generic.

Parameters:
generic set to the type of the generic if returning TRUE
Returns:
TRUE if a class template.

Definition at line 278 of file SGObject.cpp.

bool is_preprocessed ( int32_t  num  )  const [inherited]

get whether specified preprocessor was already applied

Parameters:
num index of preprocessor in list

Definition at line 193 of file Features.cpp.

void list_feature_obj (  )  const [inherited]

list feature object

Definition at line 214 of file Features.cpp.

virtual void load ( CFile *  loader  )  [virtual]

load features from file

Parameters:
loader File object via which to load data

Reimplemented from CFeatures.

DynArray< TParameter * > * load_all_file_parameters ( int32_t  file_version,
int32_t  current_version,
CSerializableFile file,
const char *  prefix = "" 
) [inherited]

maps all parameters of this instance to the provided file version and loads all parameter data from the file into an array, which is sorted (basically calls load_file_parameter(...) for all parameters and puts all results into a sorted array)

Parameters:
file_version parameter version of the file
current_version version from which mapping begins (you want to use VERSION_PARAMETER for this in most cases)
file file to load from
prefix prefix for members
Returns:
(sorted) array of created TParameter instances with file data

Definition at line 679 of file SGObject.cpp.

void load_ascii_file ( char *  fname,
bool  remap_to_bin = true,
EAlphabet  ascii_alphabet = DNA,
EAlphabet  binary_alphabet = RAWDNA 
)

load ascii line-based string features from file.

any subset is removed before

Parameters:
fname filename to load from
remap_to_bin if translation to other binary alphabet should be performed
ascii_alphabet src alphabet
binary_alphabet alphabet to translate to

Definition at line 448 of file StringFeatures.cpp.

bool load_compressed ( char *  src,
bool  decompress 
) [virtual]

load compressed features from file

any subset is removed before

Parameters:
src filename to load from
decompress whether to decompress on loading
Returns:
if loading was successful

Definition at line 1012 of file StringFeatures.cpp.

bool load_fasta_file ( const char *  fname,
bool  ignore_invalid = false 
)

load fasta file as string features

any subset is removed before

Parameters:
fname filename to load from
ignore_invalid if set to true, characters other than A,C,G,T are converted to A
Returns:
if loading was successful

Definition at line 581 of file StringFeatures.cpp.
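The effect of ignore_invalid on a single sequence can be sketched as below. replace_invalid_dna is a hypothetical standalone helper, not Shogun code, and it assumes that only the upper-case letters A, C, G, T count as valid.

```cpp
#include <string>

// Hypothetical helper (not part of Shogun): replaces every character
// other than A, C, G, T with A, mirroring the documented effect of
// ignore_invalid = true. Case handling is an assumption; only
// upper-case letters are treated as valid here.
std::string replace_invalid_dna(std::string seq)
{
    for (char& c : seq)
        if (c != 'A' && c != 'C' && c != 'G' && c != 'T')
            c = 'A';
    return seq;
}
```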

bool load_fastq_file ( const char *  fname,
bool  ignore_invalid = false,
bool  bitremap_in_single_string = false 
)

load fastq file as string features

removes subset beforehand

Parameters:
fname filename to load from
ignore_invalid if set to true, characters other than A,C,G,T are converted to A
bitremap_in_single_string if set to true, do binary embedding of symbols
Returns:
if loading was successful

Definition at line 674 of file StringFeatures.cpp.

DynArray< TParameter * > * load_file_parameters ( const SGParamInfo param_info,
int32_t  file_version,
CSerializableFile file,
const char *  prefix = "" 
) [inherited]

loads some specified parameters from a file with a specified version. The provided parameter info has a version which is recursively mapped until the file parameter version is reached. Note that there may be multiple parameters in the mapping; therefore, a set of TParameter instances is returned

Parameters:
param_info information of parameter
file_version parameter version of the file, must be <= provided parameter version
file file to load from
prefix prefix for members
Returns:
new array with TParameter instances with the attached data

Definition at line 523 of file SGObject.cpp.

bool load_from_directory ( char *  dirname  ) 

load features from directory

removes subset before

Parameters:
dirname directory name to load from
Returns:
if loading was successful

Definition at line 775 of file StringFeatures.cpp.

bool load_serializable ( CSerializableFile file,
const char *  prefix = "",
int32_t  param_version = VERSION_PARAMETER 
) [virtual, inherited]

Load this object from file. If loading fails (returning FALSE), this object will contain inconsistent data and should not be used!

Parameters:
file where to load from
prefix prefix for members
param_version (optional) a parameter version different to (this is mainly for testing, better do not use)
Returns:
TRUE if done, otherwise FALSE

Reimplemented in CModelSelectionParameters.

Definition at line 354 of file SGObject.cpp.

void load_serializable_post (  )  throw (ShogunException) [protected, virtual, inherited]

Can (optionally) be overridden to post-initialize some member variables which are not PARAMETER::ADD'ed. Make sure that at first the overridden method BASE_CLASS::LOAD_SERIALIZABLE_POST is called.

Exceptions:
ShogunException Will be thrown if an error occurs.

Reimplemented in CLinearHMM, CAlphabet, CANOVAKernel, CCircularKernel, CExponentialKernel, CGaussianKernel, CInverseMultiQuadricKernel, CKernel, CWeightedDegreePositionStringKernel, and CList.

Definition at line 1033 of file SGObject.cpp.

void load_serializable_pre (  )  throw (ShogunException) [protected, virtual, inherited]

Can (optionally) be overridden to pre-initialize some member variables which are not PARAMETER::ADD'ed. Make sure that at first the overridden method BASE_CLASS::LOAD_SERIALIZABLE_PRE is called.

Exceptions:
ShogunException Will be thrown if an error occurs.

Definition at line 1028 of file SGObject.cpp.

void map_parameters ( DynArray< TParameter * > *  param_base,
int32_t &  base_version,
DynArray< const SGParamInfo * > *  target_param_infos 
) [inherited]

Takes a set of TParameter instances (base) with a certain version and a set of target parameter infos, and recursively maps the base level by level to the current version using CSGObject::migrate(...). The base is replaced. After this call, the parameters in the base should have the same version/type as the initial target parameter infos. Note that for this to work, the migrate methods and all the internal parameter mappings have to match.

Parameters:
param_base set of TParameter instances that are mapped to the provided target parameter infos
base_version version of the parameter base
target_param_infos set of SGParamInfo instances that specify the target parameter base

Definition at line 717 of file SGObject.cpp.

TParameter * migrate ( DynArray< TParameter * > *  param_base,
const SGParamInfo target 
) [protected, virtual, inherited]

creates a new TParameter instance, which contains migrated data from the version that is provided. The provided parameter data base is used for migration; this base is a collection of all parameter data of the previous version. Migration is done FROM the data in param_base TO the provided param info. Migration is always one version step. The method has to be implemented in subclasses; if no match is found, the base method has to be called.

If there is an element in the param_base which equals the target, a copy of the element is returned. This represents the case when nothing has changed and therefore, the migrate method is not overloaded in a subclass

Parameters:
param_base set of TParameter instances to use for migration
target parameter info for the resulting TParameter
Returns:
a new TParameter instance with migrated data from the base of the type which is specified by the target parameter

Definition at line 923 of file SGObject.cpp.

int32_t obtain_by_position_list ( int32_t  window_size,
CDynamicArray< int32_t > *  positions,
int32_t  skip = 0 
)

extracts windows of size window_size from first string using the positions in list

not implemented for subset

Parameters:
window_size window size
positions positions
skip skip
Returns:
integer result (presumably the number of feature vectors created; not documented in the source)

Definition at line 1230 of file StringFeatures.cpp.
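A rough standalone sketch of extracting windows at listed positions follows. windows_at is a hypothetical helper and not Shogun's implementation; dropping positions whose window would run past the end of the string is an assumption, and the skip parameter is left out because its exact meaning here is not documented above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Shogun): cuts one window of
// window_size symbols out of s at each listed start position.
// Positions whose window would run past the end are dropped, which
// is an assumption of this sketch. The skip parameter is omitted.
std::vector<std::string> windows_at(const std::string& s,
                                    std::size_t window_size,
                                    const std::vector<std::size_t>& positions)
{
    std::vector<std::string> windows;
    for (std::size_t pos : positions)
        if (pos <= s.size() && window_size <= s.size() - pos)
            windows.push_back(s.substr(pos, window_size));
    return windows;
}
```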

int32_t obtain_by_sliding_window ( int32_t  window_size,
int32_t  step_size,
int32_t  skip = 0 
)

slides a window of size window_size over the current single string. step_size is the amount by which the window is shifted. Creates (string_len-window_size)/step_size many feature objects. If skip is nonzero, the first 'skip' characters of each string are skipped.

not implemented for subset

Parameters:
window_size window size
step_size step size
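skip number of leading characters to skip in each string
Returns:
integer result (presumably the number of feature vectors created; not documented in the source)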

Definition at line 1193 of file StringFeatures.cpp.
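The windowing can be illustrated with the standalone sketch below. sliding_windows is a hypothetical helper, not Shogun code. It emits every window that fits entirely inside the string and drops the first skip characters of each emitted window, which is one reading of the description above; the exact window count and skip handling in Shogun are not confirmed by this page.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Shogun): slides a window of
// window_size symbols over s in steps of step_size and returns each
// window with its first skip symbols removed.
std::vector<std::string> sliding_windows(const std::string& s,
                                         std::size_t window_size,
                                         std::size_t step_size,
                                         std::size_t skip = 0)
{
    std::vector<std::string> windows;
    // Also covers window_size == 0, since skip >= 0 always holds.
    if (step_size == 0 || skip >= window_size)
        return windows;
    for (std::size_t pos = 0; pos + window_size <= s.size(); pos += step_size)
        windows.push_back(s.substr(pos + skip, window_size - skip));
    return windows;
}
```

With the string ACGTAC, a window of 3 and a step of 1, the sketch yields ACG, CGT, GTA and TAC.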

bool obtain_from_char ( CStringFeatures< char > *  sf,
int32_t  start,
int32_t  p_order,
int32_t  gap,
bool  rev 
)

obtain string features from char features

wrapper for template method

any subset is removed before, subset of parameter sf is possible

Parameters:
sf string features
start start
p_order order
gap gap
rev reverse
Returns:
if obtaining was successful

Definition at line 1288 of file StringFeatures.cpp.
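One plausible reading of p_order, given the "higher order mapping" members documented under Member Data Documentation, is that several consecutive characters are packed into a single symbol of the target type. The standalone sketch below shows that idea only; pack_kmers is a hypothetical helper, the 2-bit DNA coding is an assumption, and start, gap and rev are not modelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Shogun): packs every run of
// `order` consecutive DNA letters into one integer, 2 bits per
// letter (A=0, C=1, G=2, T=3; the coding is an assumption).
std::vector<std::uint64_t> pack_kmers(const std::string& seq, std::size_t order)
{
    std::vector<std::uint64_t> words;
    if (order == 0 || order > 32 || seq.size() < order)
        return words;
    for (std::size_t i = 0; i + order <= seq.size(); ++i)
    {
        std::uint64_t w = 0;
        for (std::size_t j = 0; j < order; ++j)
        {
            std::uint64_t code = 0;
            switch (seq[i + j])
            {
                case 'C': code = 1; break;
                case 'G': code = 2; break;
                case 'T': code = 3; break;
                default:  code = 0; break;   // 'A' and anything else
            }
            w = (w << 2) | code;             // earlier letters end up in higher bits
        }
        words.push_back(w);
    }
    return words;
}
```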

bool obtain_from_char_features ( CStringFeatures< CT > *  sf,
int32_t  start,
int32_t  p_order,
int32_t  gap,
bool  rev 
)

Definition at line 1862 of file StringFeatures.cpp.

bool obtain_from_char_features ( CStringFeatures< CT > *  sf,
int32_t  start,
int32_t  p_order,
int32_t  gap,
bool  rev 
)

Definition at line 1858 of file StringFeatures.cpp.

bool obtain_from_char_features ( CStringFeatures< CT > *  sf,
int32_t  start,
int32_t  p_order,
int32_t  gap,
bool  rev 
)

Definition at line 1866 of file StringFeatures.cpp.

bool obtain_from_char_features ( CStringFeatures< CT > *  sf,
int32_t  start,
int32_t  p_order,
int32_t  gap,
bool  rev 
)

template obtain from char features

any subset is removed before, subset of parameter sf is possible

Parameters:
sf string features
start start
p_order order
gap gap
rev reverse
Returns:
if obtaining was successful

Definition at line 1970 of file StringFeatures.cpp.

void one_to_one_migration_prepare ( DynArray< TParameter * > *  param_base,
const SGParamInfo target,
TParameter *&  replacement,
TParameter *&  to_migrate,
char *  old_name = NULL 
) [protected, virtual, inherited]

This method prepares everything for a one-to-one parameter migration. One to one here means that only ONE element of the parameter base is needed for the migration (the one with the same name as the target). Data is allocated for the target (in the type as provided in the target SGParamInfo), and a corresponding new TParameter instance is written to replacement. The to_migrate pointer points to the single needed TParameter instance needed for migration. If a name change happened, the old name may be specified by old_name. In addition, the m_delete_data flag of to_migrate is set to true. So if you want to migrate data, the only thing to do after this call is converting the data in the m_parameter fields. If unsure how to use - have a look into an example for this. (base_migration_type_conversion.cpp for example)

Parameters:
param_base set of TParameter instances to use for migration
target parameter info for the resulting TParameter
replacement (used as output) here the TParameter instance which is returned by migration is created into
to_migrate the only source that is used for migration
old_name with this parameter, a name change may be specified

Definition at line 864 of file SGObject.cpp.

void print_modsel_params (  )  [inherited]

prints all parameters registered for model selection and their types

Definition at line 1084 of file SGObject.cpp.

void print_serializable ( const char *  prefix = ""  )  [virtual, inherited]

prints registered parameters out

Parameters:
prefix prefix for members

Definition at line 290 of file SGObject.cpp.

void remove_all_subsets (  )  [virtual, inherited]

removes all subsets. Calls subset_changed_post() afterwards.

Reimplemented in CCombinedFeatures.

Definition at line 363 of file Features.cpp.

void remove_subset (  )  [virtual, inherited]

removes the last added subset from the subset stack, if one exists. Calls subset_changed_post() afterwards.

Reimplemented in CCombinedFeatures.

Definition at line 357 of file Features.cpp.
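The stack behaviour of subsets can be pictured with the standalone sketch below. SubsetStackSketch is a hypothetical class and not Shogun's CSubsetStack; it only illustrates how indices could be translated when subsets are added on top of each other and removed again.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class (not Shogun's CSubsetStack): each added subset
// is a list of indices into the view below it; the top of the stack
// is the active view.
class SubsetStackSketch
{
public:
    // idx refers to positions in the currently active view.
    void add_subset(const std::vector<std::size_t>& idx)
    {
        std::vector<std::size_t> composed;
        for (std::size_t i : idx)
            composed.push_back(translate(i));
        stack_.push_back(composed);
    }

    // Removes the most recently added subset, if there is one.
    void remove_subset()
    {
        if (!stack_.empty())
            stack_.pop_back();
    }

    void remove_all_subsets() { stack_.clear(); }

    // Maps an index in the active view to an index in the full data.
    std::size_t translate(std::size_t i) const
    {
        return stack_.empty() ? i : stack_.back().at(i);
    }

private:
    std::vector<std::vector<std::size_t>> stack_;
};
```

After adding the subset {4, 7, 9} and then {2, 0} on top of it, index 0 in the active view refers to element 9 of the full data; removing the top subset restores the {4, 7, 9} view.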

bool reshape ( int32_t  num_features,
int32_t  num_vectors 
) [virtual, inherited]

in case there is a feature matrix allow for reshaping

NOT IMPLEMENTED!

Parameters:
num_features new number of features
num_vectors new number of vectors
Returns:
if reshaping was successful

Reimplemented in CDenseFeatures< ST >, CDenseFeatures< uint32_t >, CDenseFeatures< float64_t >, CDenseFeatures< T >, and CDenseFeatures< uint16_t >.

Definition at line 208 of file Features.cpp.

virtual void save ( CFile *  writer  )  [virtual]

save features to file

not possible with subset

Parameters:
writer File object via which to save data

Reimplemented from CFeatures.

bool save_compressed ( char *  dest,
E_COMPRESSION_TYPE  compression,
int  level 
) [virtual]

save compressed features to file

not possible with subset

Parameters:
dest filename to save to
compression compressor to use
level compression level to use (1-9)
Returns:
if saving was successful

Definition at line 1107 of file StringFeatures.cpp.

bool save_serializable ( CSerializableFile file,
const char *  prefix = "",
int32_t  param_version = VERSION_PARAMETER 
) [virtual, inherited]

Save this object to file.

Parameters:
file where to save the object; will be closed during returning if PREFIX is an empty string.
prefix prefix for members
param_version (optional) a parameter version different to (this is mainly for testing, better do not use)
Returns:
TRUE if done, otherwise FALSE

Reimplemented in CModelSelectionParameters.

Definition at line 296 of file SGObject.cpp.

void save_serializable_post (  )  throw (ShogunException) [protected, virtual, inherited]

Can (optionally) be overridden to post-initialize some member variables which are not PARAMETER::ADD'ed. Make sure that at first the overridden method BASE_CLASS::SAVE_SERIALIZABLE_POST is called.

Exceptions:
ShogunException Will be thrown if an error occurs.

Reimplemented in CKernel.

Definition at line 1043 of file SGObject.cpp.

void save_serializable_pre (  )  throw (ShogunException) [protected, virtual, inherited]

Can (optionally) be overridden to pre-initialize some member variables which are not PARAMETER::ADD'ed. Make sure that at first the overridden method BASE_CLASS::SAVE_SERIALIZABLE_PRE is called.

Exceptions:
ShogunException Will be thrown if an error occurs.

Reimplemented in CKernel.

Definition at line 1038 of file SGObject.cpp.

void set_feature_vector ( SGVector< ST >  vector,
int32_t  num 
)

set string for selected example num

not possible with subset

Parameters:
vector new string for example num
num index of the string

Definition at line 238 of file StringFeatures.cpp.

void set_feature_vector ( int32_t  num,
ST *  string,
int32_t  len 
) [virtual]

set feature vector for sample num

possible with subset

Parameters:
num index of feature vector
string string with the feature vector's content
len length of the string

Definition at line 1443 of file StringFeatures.cpp.

void set_features ( SGStringList< ST >  feats  ) 

set features

not possible with subset

Definition at line 845 of file StringFeatures.cpp.

bool set_features ( SGString< ST > *  p_features,
int32_t  p_num_vectors,
int32_t  p_max_string_length 
)

set features

not possible with subset

Parameters:
p_features new features
p_num_vectors number of vectors
p_max_string_length maximum string length
Returns:
if setting was successful

Definition at line 850 of file StringFeatures.cpp.

void set_generic< floatmax_t > (  )  [inherited]

set generic type to T

void set_global_io ( SGIO io  )  [inherited]

set the io object

Parameters:
io io object to use

Definition at line 217 of file SGObject.cpp.

void set_global_parallel ( Parallel parallel  )  [inherited]

set the parallel object

Parameters:
parallel parallel object to use

Definition at line 230 of file SGObject.cpp.

void set_global_version ( Version version  )  [inherited]

set the version object

Parameters:
version version object to use

Definition at line 265 of file SGObject.cpp.

void set_preprocessed ( int32_t  num  )  [inherited]

set applied flag for preprocessor

Parameters:
num index of preprocessor in list

Definition at line 188 of file Features.cpp.

void set_property ( EFeatureProperty  p  )  [inherited]

set property

Parameters:
p feature property to set

Definition at line 341 of file Features.cpp.

virtual CSGObject* shallow_copy (  )  const [virtual, inherited]

A shallow copy. All the SGObject instance variables will be simply assigned and SG_REF-ed.

Reimplemented in CGaussianKernel.

Definition at line 122 of file SGObject.h.

floatmax_t shift_offset ( floatmax_t  symbol,
int32_t  amount 
)

Definition at line 1835 of file StringFeatures.cpp.

bool shift_offset ( bool  symbol,
int32_t  amount 
)

Definition at line 1823 of file StringFeatures.cpp.

float32_t shift_offset ( float32_t  symbol,
int32_t  amount 
)

Definition at line 1827 of file StringFeatures.cpp.

ST shift_offset ( ST  offset,
int32_t  amount 
)

shift offset to the left by amount

Parameters:
offset offset to shift
amount amount to shift the offset
Returns:
shifted offset

Definition at line 436 of file StringFeatures.cpp.

float64_t shift_offset ( float64_t  symbol,
int32_t  amount 
)

Definition at line 1831 of file StringFeatures.cpp.

ST shift_symbol ( ST  symbol,
int32_t  amount 
)

shift symbol to the right by amount (taking care of custom symbol sizes)

Parameters:
symbol symbol to shift
amount amount to shift the symbol
Returns:
shifted symbol

Definition at line 442 of file StringFeatures.cpp.
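shift_offset and shift_symbol can be pictured as bit shifts by whole symbols. The sketch below is standalone C++ with hypothetical names; it assumes that amount counts symbols and that the symbol size in bits is passed in explicitly, whereas the class presumably takes it from its alphabet.

```cpp
#include <cstdint>

// Hypothetical helpers (not part of Shogun). bits_per_symbol stands
// in for the symbol size of the alphabet, e.g. 2 bits for the four
// DNA letters; passing it explicitly is an assumption of this sketch.
// Callers must keep amount * bits_per_symbol below 64.
std::uint64_t shift_offset_sketch(std::uint64_t offset, int amount, int bits_per_symbol)
{
    return offset << (amount * bits_per_symbol);   // shift left by whole symbols
}

std::uint64_t shift_symbol_sketch(std::uint64_t symbol, int amount, int bits_per_symbol)
{
    return symbol >> (amount * bits_per_symbol);   // shift right by whole symbols
}
```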

floatmax_t shift_symbol ( floatmax_t  symbol,
int32_t  amount 
)

Definition at line 1852 of file StringFeatures.cpp.

bool shift_symbol ( bool  symbol,
int32_t  amount 
)

Definition at line 1840 of file StringFeatures.cpp.

float64_t shift_symbol ( float64_t  symbol,
int32_t  amount 
)

Definition at line 1848 of file StringFeatures.cpp.

float32_t shift_symbol ( float32_t  symbol,
int32_t  amount 
)

Definition at line 1844 of file StringFeatures.cpp.

void subset_changed_post (  )  [virtual]

post method when subset is changed

Reimplemented from CFeatures.

Definition at line 1636 of file StringFeatures.cpp.

void unembed_word ( float64_t  word,
uint8_t *  seq,
int32_t  len 
)

Definition at line 1908 of file StringFeatures.cpp.

void unembed_word ( ST  word,
uint8_t *  seq,
int32_t  len 
)

remap bit-based word to character sequence

Parameters:
word word to remap
seq sequence of size len that remapped characters are written to
len length of sequence and word

Definition at line 1393 of file StringFeatures.cpp.
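Remapping a bit-based word back to characters can be sketched as below. unembed_dna_word is a hypothetical standalone helper; the fixed 2-bit code A=0, C=1, G=2, T=3 and the most-significant-symbol-first order are assumptions, while the class itself would consult its alphabet.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of Shogun): unpacks len symbols of
// 2 bits each from word into characters, most significant symbol
// first. The 2-bit code A=0, C=1, G=2, T=3 is an assumption.
std::string unembed_dna_word(std::uint64_t word, int len)
{
    static const char letters[4] = {'A', 'C', 'G', 'T'};
    std::string seq(static_cast<std::size_t>(len), 'A');
    for (int i = 0; i < len; ++i)
    {
        seq[len - 1 - i] = letters[word & 0x3u];   // lowest 2 bits = last symbol
        word >>= 2;
    }
    return seq;
}
```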

void unembed_word ( float32_t  word,
uint8_t *  seq,
int32_t  len 
)

Definition at line 1905 of file StringFeatures.cpp.

void unembed_word ( floatmax_t  word,
uint8_t *  seq,
int32_t  len 
)

Definition at line 1911 of file StringFeatures.cpp.

void unset_generic (  )  [inherited]

unset generic type

this has to be called in classes specializing a template class

Definition at line 285 of file SGObject.cpp.

void unset_property ( EFeatureProperty  p  )  [inherited]

unset property

Parameters:
p feature property to unset

Definition at line 346 of file Features.cpp.

bool update_parameter_hash (  )  [protected, virtual, inherited]

Updates the hash of current parameter combination.

Returns:
bool if parameter combination has changed since last update.

Definition at line 237 of file SGObject.cpp.


Member Data Documentation

CAlphabet* alphabet [protected]

alphabet

Definition at line 673 of file StringFeatures.h.

CCache<ST>* feature_cache [protected]

feature cache

Definition at line 709 of file StringFeatures.h.

SGString<ST>* features [protected]

this contains the array of features

Definition at line 679 of file StringFeatures.h.

SGIO* io [inherited]

io

Definition at line 462 of file SGObject.h.

int32_t length_of_single_string [protected]

length of prior single string

Definition at line 685 of file StringFeatures.h.

uint32_t m_hash [inherited]

Hash of parameter values

Definition at line 480 of file SGObject.h.

model selection parameters

Definition at line 474 of file SGObject.h.

map for different parameter versions

Definition at line 477 of file SGObject.h.

Parameter* m_parameters [inherited]

parameters

Definition at line 471 of file SGObject.h.

CSubsetStack* m_subset_stack [protected, inherited]

subset used for index transformations

Definition at line 296 of file Features.h.

int32_t max_string_length [protected]

length of longest string (for subset, is updated)

Definition at line 688 of file StringFeatures.h.

floatmax_t num_symbols [protected]

number of used symbols

Definition at line 691 of file StringFeatures.h.

int32_t num_vectors [protected]

number of string vectors (for subset, is updated)

Definition at line 676 of file StringFeatures.h.

int32_t order [protected]

order used in higher order mapping

Definition at line 697 of file StringFeatures.h.

original number of used symbols (before higher order mapping)

Definition at line 694 of file StringFeatures.h.

Parallel* parallel [inherited]

parallel

Definition at line 465 of file SGObject.h.

bool preprocess_on_get [protected]

preprocess on-the-fly?

Definition at line 706 of file StringFeatures.h.

ST* single_string [protected]

true when single string / created by sliding window

Definition at line 682 of file StringFeatures.h.

ST* symbol_mask_table [protected]

order used in higher order mapping

Definition at line 700 of file StringFeatures.h.

int32_t symbol_mask_table_len [protected]

order used in higher order mapping

Definition at line 703 of file StringFeatures.h.

Version* version [inherited]

version

Definition at line 468 of file SGObject.h.


The documentation for this class was generated from the following files:

SHOGUN Machine Learning Toolbox - Documentation