This class implements streaming features for use with VW (Vowpal Wabbit).
Each example is stored in a VwExample object, which also holds the label and other per-example information. Features are hashed and are meant to be used with a weight vector whose dimension is allocated in advance.
Definition at line 39 of file StreamingVwFeatures.h.
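The hashing idea mentioned above can be illustrated with a minimal, self-contained sketch. It assumes the weight vector has 2^num_bits entries and that a feature name is reduced to a slot by masking its hash. The function name `hashed_index`, the `num_bits` parameter, and the use of `std::hash` are illustrative assumptions, not the hash or API used by this class.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Map a feature name to a slot in a weight vector of 2^num_bits entries.
// std::hash is only a stand-in here; the hash actually used by the
// library is not described on this page.
std::size_t hashed_index(const std::string& name, unsigned num_bits)
{
    const std::size_t mask = (std::size_t(1) << num_bits) - 1;
    return std::hash<std::string>{}(name) & mask; // always < 2^num_bits
}
```

Because every index is masked into the same fixed range, the weight vector can be allocated once, before any example has been read.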
Public Member Functions | |
CStreamingVwFeatures () | |
CStreamingVwFeatures (CStreamingVwFile *file, bool is_labelled, int32_t size) | |
CStreamingVwFeatures (CStreamingVwCacheFile *file, bool is_labelled, int32_t size) | |
~CStreamingVwFeatures () | |
CFeatures * | duplicate () const |
virtual void | set_vector_reader () |
virtual void | set_vector_and_label_reader () |
virtual void | start_parser () |
virtual void | end_parser () |
virtual void | reset_stream () |
virtual CVwEnvironment * | get_env () |
virtual void | set_env (CVwEnvironment *vw_env) |
virtual bool | get_next_example () |
virtual VwExample * | get_example () |
virtual float64_t | get_label () |
virtual void | release_example () |
virtual void | expand_if_required (float32_t *&vec, int32_t &len) |
virtual void | expand_if_required (float64_t *&vec, int32_t &len) |
virtual int32_t | get_dim_feature_space () const |
virtual float32_t | real_weight (float32_t w, float32_t gravity) |
virtual float32_t | dot (CStreamingDotFeatures *df) |
virtual float32_t | dense_dot (VwExample *&ex, const float32_t *vec2) |
virtual float32_t | dense_dot (const float32_t *vec2, int32_t vec2_len) |
virtual float32_t | dense_dot (SGSparseVector< float32_t > *vec1, const float32_t *vec2) |
virtual float32_t | dense_dot_truncated (const float32_t *vec2, VwExample *&ex, float32_t gravity) |
virtual void | add_to_dense_vec (float32_t alpha, VwExample *&ex, float32_t *vec2, int32_t vec2_len, bool abs_val=false) |
virtual void | add_to_dense_vec (float32_t alpha, float32_t *vec2, int32_t vec2_len, bool abs_val=false) |
virtual int32_t | get_nnz_features_for_vector () |
virtual int32_t | get_num_features () |
virtual EFeatureType | get_feature_type () |
virtual EFeatureClass | get_feature_class () |
virtual const char * | get_name () const |
virtual int32_t | get_num_vectors () const |
virtual int32_t | get_size () |
Protected Attributes | |
CInputParser< VwExample > | parser |
The parser object, which reads from input and returns parsed example objects. | |
vw_size_t | example_count |
Number of examples processed so far. | |
float64_t | current_label |
The current example's label. | |
int32_t | current_length |
Number of features in current example. | |
CVwEnvironment * | env |
Environment for VW. | |
VwExample * | current_example |
Example currently being processed. |
CStreamingVwFeatures | ( | ) |
Default constructor.
Sets the reading functions to be CStreamingFile::get_*_vector and get_*_vector_and_label depending on the type T.
Definition at line 50 of file StreamingVwFeatures.h.
CStreamingVwFeatures | ( | CStreamingVwFile * | file, | |
bool | is_labelled, | |||
int32_t | size | |||
) |
Constructor taking args. Initializes the parser with the given args.
file | StreamingFile object, input file. | |
is_labelled | Whether examples are labelled or not. | |
size | Number of example objects to be stored in the parser at a time. |
Definition at line 65 of file StreamingVwFeatures.h.
CStreamingVwFeatures | ( | CStreamingVwCacheFile * | file, | |
bool | is_labelled, | |||
int32_t | size | |||
) |
Constructor used when initialized with a cache file.
file | StreamingVwCacheFile object | |
is_labelled | Whether examples are labelled or not | |
size | Number of example objects to be stored in the parser at a time |
Definition at line 82 of file StreamingVwFeatures.h.
~CStreamingVwFeatures | ( | ) |
Destructor.
Ends the parsing thread, waiting for pthread_join to complete.
Definition at line 96 of file StreamingVwFeatures.h.
void add_to_dense_vec | ( | float32_t | alpha, | |
VwExample *& | ex, | |||
float32_t * | vec2, | |||
int32_t | vec2_len, | |||
bool | abs_val = false | |||
) | [virtual] |
Add alpha times an example's feature vector to another dense vector. Takes the absolute value of the example's vector if specified.
alpha | alpha | |
ex | example whose vector should be used | |
vec2 | vector to add to | |
vec2_len | length of vector | |
abs_val | true if abs of example's vector should be taken |
Definition at line 243 of file StreamingVwFeatures.cpp.
void add_to_dense_vec | ( | float32_t | alpha, | |
float32_t * | vec2, | |||
int32_t | vec2_len, | |||
bool | abs_val = false | |||
) | [virtual] |
Add alpha*current_vector to another dense vector. Takes the absolute value of current_vector if specified
alpha | alpha | |
vec2 | vector to add to | |
vec2_len | length of vector | |
abs_val | true if abs of current_vector should be taken |
Implements CStreamingDotFeatures.
Definition at line 263 of file StreamingVwFeatures.cpp.
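As a rough illustration of what the two add_to_dense_vec overloads compute, the sketch below adds a scaled sparse vector into a dense one. `SparseEntry` and `add_to_dense` are hypothetical stand-ins rather than the library's types, and the indices are assumed to already lie inside the dense vector.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sparse entry: an index into the dense vector and a value.
struct SparseEntry {
    std::size_t index;
    float value;
};

// vec2[index] += alpha * value for every entry; with abs_val set,
// |value| is used instead. Indices are assumed to be in range.
void add_to_dense(float alpha, const std::vector<SparseEntry>& x,
                  std::vector<float>& vec2, bool abs_val = false)
{
    for (const SparseEntry& e : x) {
        const float v = abs_val ? std::fabs(e.value) : e.value;
        vec2[e.index] += alpha * v;
    }
}
```

Only the slots named by the sparse entries are touched, so the cost depends on the number of non-zero features and not on the length of vec2.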
float32_t dense_dot | ( | VwExample *& | ex, | |
const float32_t * | vec2 | |||
) | [virtual] |
Dot product of an example with a vector.
ex | example, as VwExample | |
vec2 | vector to take dot product with |
Definition at line 202 of file StreamingVwFeatures.cpp.
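float32_t dense_dot | ( | const float32_t * | vec2, | |
int32_t | vec2_len | |||
) | [virtual] |
Dot product of the current feature vector with a dense vector that stores weights at hashed indices.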
vec2 | dense weight vector | |
vec2_len | length of weight vector (not used) |
Implements CStreamingDotFeatures.
Definition at line 213 of file StreamingVwFeatures.cpp.
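A dot product against weights stored at hashed indices can be sketched as follows. The `HashedFeature` struct, the function name, and the explicit `mask` argument are assumptions made for illustration. In this sketch the mask bounds every index, which is one plausible reason a separate length argument would go unused.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical hashed feature: the full hash and the feature value.
struct HashedFeature {
    std::uint32_t hash;
    float value;
};

// Sum of weight * value, where each weight is looked up at (hash & mask).
// vec2 must hold at least mask + 1 entries.
float dense_dot_sketch(const std::vector<HashedFeature>& example,
                       const float* vec2, std::uint32_t mask)
{
    float sum = 0.0f;
    for (const HashedFeature& f : example)
        sum += vec2[f.hash & mask] * f.value;
    return sum;
}
```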
float32_t dense_dot | ( | SGSparseVector< float32_t > * | vec1, | |
const float32_t * | vec2 | |||
) | [virtual] |
Dot product between a dense weight vector and a sparse feature vector. Assumes the features to belong to the constant namespace.
vec1 | sparse feature vector | |
vec2 | weight vector |
Definition at line 218 of file StreamingVwFeatures.cpp.
float32_t dense_dot_truncated | ( | const float32_t * | vec2, | |
VwExample *& | ex, | |||
float32_t | gravity | |||
) | [virtual] |
Calculate dot product of features with another vector, truncating the elements of that vector by magnitude 'gravity' to a minimum final magnitude of zero.
vec2 | vector to take dot product with | |
ex | example whose features have to be taken | |
gravity | value to use for truncating |
Definition at line 227 of file StreamingVwFeatures.cpp.
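One reading of "truncating by magnitude" is soft-thresholding: shrink the absolute value of each weight by gravity, clamp it at zero, and keep the sign. For non-negative weights this coincides with max(w - gravity, 0). The sketch below follows that reading; `truncate_weight`, `truncated_dot`, and `SparseEntry` are illustrative names, not the library's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sparse entry: an index into the weight vector and a value.
struct SparseEntry {
    std::size_t index;
    float value;
};

// Shrink |w| by gravity, never below zero, and keep the sign of w.
float truncate_weight(float w, float gravity)
{
    const float shrunk = std::fabs(w) - gravity;
    if (shrunk <= 0.0f)
        return 0.0f;
    return w > 0.0f ? shrunk : -shrunk;
}

// Dot product in which every weight is truncated before it is used.
// Indices are assumed to lie inside the weight vector.
float truncated_dot(const std::vector<float>& weights,
                    const std::vector<SparseEntry>& x, float gravity)
{
    float sum = 0.0f;
    for (const SparseEntry& e : x)
        sum += truncate_weight(weights[e.index], gravity) * e.value;
    return sum;
}
```

The weight vector itself is left unmodified; truncation is applied on the fly to each weight that the example touches.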
float32_t dot | ( | CStreamingDotFeatures * | df | ) | [virtual] |
Dot product taken with another StreamingDotFeatures object.
Currently only works if it is a CStreamingVwFeatures object. It takes the dot product of the current_vectors of both objects.
df | CStreamingDotFeatures object. |
Implements CStreamingDotFeatures.
Definition at line 196 of file StreamingVwFeatures.cpp.
CFeatures* duplicate | ( | ) | const [virtual] |
Duplicate this object
Implements CFeatures.
Definition at line 107 of file StreamingVwFeatures.h.
void end_parser | ( | ) | [virtual] |
Ends the parsing thread.
Waits for the thread to join.
Implements CStreamingFeatures.
Definition at line 137 of file StreamingVwFeatures.cpp.
virtual void expand_if_required | ( | float32_t *& | vec, | |
int32_t & | len | |||
) | [virtual] |
Expand the vector passed so that its length equals the dimensionality of the features. The previous values are kept intact through realloc, and the new ones are set to zero.
vec | float32_t* vector | |
len | length of the vector |
Reimplemented from CStreamingDotFeatures.
Definition at line 229 of file StreamingVwFeatures.h.
virtual void expand_if_required | ( | float64_t *& | vec, | |
int32_t & | len | |||
) | [virtual] |
Expand the vector passed so that its length equals the dimensionality of the features. The previous values are kept intact through realloc, and the new ones are set to zero.
vec | float64_t* vector | |
len | length of the vector |
Reimplemented from CStreamingDotFeatures.
Definition at line 248 of file StreamingVwFeatures.h.
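The grow-and-zero-fill behaviour described for both overloads can be sketched with plain realloc. The function `expand_to_dim` and its explicit `dim` argument are hypothetical. This is a minimal illustration of the technique, not the code in StreamingVwFeatures.h.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>

// Grow vec to dim entries if it is shorter, keeping the old values and
// zeroing the new tail. Returns false if the allocation fails, in which
// case vec and len are left untouched.
bool expand_to_dim(float*& vec, int& len, int dim)
{
    if (dim <= len)
        return true; // already large enough, nothing to do
    void* grown = std::realloc(vec, sizeof(float) * static_cast<std::size_t>(dim));
    if (grown == nullptr)
        return false;
    vec = static_cast<float*>(grown);
    std::fill(vec + len, vec + dim, 0.0f); // only the new entries
    len = dim;
    return true;
}
```

Both arguments are passed by reference because realloc may move the buffer and the caller needs the updated pointer and length.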
int32_t get_dim_feature_space | ( | ) | const [virtual] |
Obtain the dimensionality of the feature space.
(Do not confuse this with the dimensionality of the input space, usually obtained via get_num_features().)
Implements CStreamingDotFeatures.
Definition at line 191 of file StreamingVwFeatures.cpp.
virtual CVwEnvironment* get_env | ( | ) | [virtual] |
VwExample * get_example | ( | ) | [virtual] |
Returns the current example.
Definition at line 162 of file StreamingVwFeatures.cpp.
EFeatureClass get_feature_class | ( | ) | [virtual] |
Return the feature class
Implements CFeatures.
Definition at line 273 of file StreamingVwFeatures.cpp.
EFeatureType get_feature_type | ( | ) | [virtual] |
Return the feature type, depending on T.
Implements CFeatures.
Definition at line 30 of file StreamingVwFeatures.cpp.
float64_t get_label | ( | ) | [virtual] |
Return the label of the current example as a float.
Examples must be labelled, otherwise an error occurs.
Implements CStreamingFeatures.
Definition at line 167 of file StreamingVwFeatures.cpp.
virtual const char* get_name | ( | void | ) | const [virtual] |
Return the name.
Implements CSGObject.
Definition at line 398 of file StreamingVwFeatures.h.
bool get_next_example | ( | ) | [virtual] |
Instructs the parser to return the next example.
This example is stored as the current_example in this object.
Implements CStreamingFeatures.
Definition at line 142 of file StreamingVwFeatures.cpp.
virtual int32_t get_nnz_features_for_vector | ( | ) | [virtual] |
Get the number of non-zero features in the vector.
Reimplemented from CStreamingDotFeatures.
Definition at line 367 of file StreamingVwFeatures.h.
int32_t get_num_features | ( | ) | [virtual] |
Return the number of features in the current example.
Implements CStreamingFeatures.
Definition at line 268 of file StreamingVwFeatures.cpp.
virtual int32_t get_num_vectors | ( | ) | const [virtual] |
Return the number of vectors stored in this object.
Implements CFeatures.
Definition at line 405 of file StreamingVwFeatures.h.
virtual int32_t get_size | ( | ) | [virtual] |
Return the size of one T object.
Implements CFeatures.
Definition at line 418 of file StreamingVwFeatures.h.
virtual float32_t real_weight | ( | float32_t | w, | |
float32_t | gravity | |||
) | [virtual] |
Reduce element 'w' to max(w-gravity, 0).
w | value to truncate | |
gravity | value to truncate using |
Definition at line 276 of file StreamingVwFeatures.h.
void release_example | ( | ) | [virtual] |
Release the current example, indicating to the parser that it has been processed by the learning algorithm.
The parser is then free to throw away that example.
Implements CStreamingFeatures.
Definition at line 174 of file StreamingVwFeatures.cpp.
virtual void reset_stream | ( | ) | [virtual] |
Reset the file back to the first example. Only works for cache files.
Reimplemented from CStreamingFeatures.
Definition at line 152 of file StreamingVwFeatures.h.
virtual void set_env | ( | CVwEnvironment * | vw_env | ) | [virtual] |
Set the environment
vw_env | environment |
Definition at line 181 of file StreamingVwFeatures.h.
void set_vector_and_label_reader | ( | ) | [virtual] |
Sets the read function (in case the examples are labelled) to get_*_vector_and_label from CStreamingFile.
The exact function depends on type T.
The parser uses the function set by this while reading labelled examples.
Implements CStreamingFeatures.
Definition at line 25 of file StreamingVwFeatures.cpp.
void set_vector_reader | ( | ) | [virtual] |
Sets the read function (in case the examples are unlabelled) to get_*_vector() from CStreamingFile.
The exact function depends on type T.
The parser uses the function set by this while reading unlabelled examples.
Implements CStreamingFeatures.
Definition at line 20 of file StreamingVwFeatures.cpp.
void start_parser | ( | ) | [virtual] |
Starts the parsing thread.
To be called before trying to use any feature vectors from this object.
Implements CStreamingFeatures.
Definition at line 131 of file StreamingVwFeatures.cpp.
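The calling order implied by start_parser, get_next_example, get_label, release_example and end_parser can be sketched as a loop. Since a real stream needs an input file and a parsing thread, the example uses `MockStream`, a hypothetical in-memory stand-in that only mirrors the method names documented on this page. `sum_labels` is likewise an illustrative helper.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in that mimics the documented method names.
// It serves labels from memory instead of parsing a file in a thread.
class MockStream {
public:
    explicit MockStream(std::vector<double> labels)
        : labels_(std::move(labels)) {}

    void start_parser() { started_ = true; pos_ = 0; }
    bool get_next_example()
    {
        if (!started_ || pos_ >= labels_.size())
            return false; // not started, or stream exhausted
        current_ = labels_[pos_++];
        return true;
    }
    double get_label() const { return current_; }
    void release_example() {} // a real parser could recycle the slot here
    void end_parser() { started_ = false; }

private:
    std::vector<double> labels_;
    std::size_t pos_ = 0;
    double current_ = 0.0;
    bool started_ = false;
};

// Typical consumption loop: start, fetch, use, release, end.
template <class Stream>
double sum_labels(Stream& features)
{
    double total = 0.0;
    features.start_parser();
    while (features.get_next_example()) {
        total += features.get_label();
        features.release_example(); // done with this example
    }
    features.end_parser();
    return total;
}
```

Releasing each example as soon as it has been used matters for a bounded parser buffer, since the parser can only reuse a slot once the consumer has given it back.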
VwExample* current_example [protected] |
Example currently being processed.
Definition at line 471 of file StreamingVwFeatures.h.
float64_t current_label [protected] |
The current example's label.
Definition at line 462 of file StreamingVwFeatures.h.
int32_t current_length [protected] |
Number of features in current example.
Definition at line 465 of file StreamingVwFeatures.h.
CVwEnvironment* env [protected] |
Environment for VW.
Definition at line 468 of file StreamingVwFeatures.h.
vw_size_t example_count [protected] |
Number of examples processed so far.
Definition at line 459 of file StreamingVwFeatures.h.
CInputParser<VwExample> parser [protected] |
The parser object, which reads from input and returns parsed example objects.
Definition at line 456 of file StreamingVwFeatures.h.