File based string features.
String features that are backed by a file. Memory mapped files are used underneath. The class is derived from CStringFeatures, so all of the StringFeature functionality is available transparently.
The supported file format contains one string per line. Lines of variable length are supported and must be separated by '\n'.
Definition at line 34 of file StringFileFeatures.h.
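The memory mapped access described above can be illustrated with a minimal sketch. It is not the library's implementation: it assumes a POSIX system, and the function name `count_lines_mmap` is hypothetical. It maps a file read-only and counts the '\n'-terminated lines, which is the file format this class expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (not part of the documented class): maps a file
// read-only and counts its '\n'-terminated lines.
// Returns the number of lines, or -1 if the file cannot be opened or mapped.
int64_t count_lines_mmap(const char* fname)
{
    assert(fname != nullptr);

    int fd = open(fname, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0)
    {
        close(fd);
        return -1;
    }

    // mmap rejects zero-length mappings, so handle empty files up front.
    if (st.st_size == 0)
    {
        close(fd);
        return 0;
    }

    const size_t size = static_cast<size_t>(st.st_size);
    void* addr = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); // the mapping remains valid after the descriptor is closed
    if (addr == MAP_FAILED)
        return -1;

    const char* data = static_cast<const char*>(addr);
    int64_t lines = 0;
    for (size_t i = 0; i < size; ++i)
    {
        if (data[i] == '\n')
            ++lines;
    }

    munmap(addr, size);
    return lines;
}
```

Because the file contents are paged in on demand, a scan like this does not require loading the whole file into memory first.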
Public Member Functions | |
CStringFileFeatures () | |
CStringFileFeatures (const char *fname, EAlphabet alpha) | |
virtual | ~CStringFileFeatures () |
Protected Member Functions | |
ST * | get_line (uint64_t &len, uint64_t &offs, int32_t &line_nr, uint64_t file_length) |
virtual void | cleanup () |
virtual void | cleanup_feature_vector (int32_t num) |
void | fetch_meta_info_from_file (int32_t granularity=1048576) |
Protected Attributes | |
CMemoryMappedFile< ST > * | file |
default constructor
Definition at line 41 of file StringFileFeatures.h.
CStringFileFeatures | ( | const char * | fname, | |
EAlphabet | alpha | |||
) |
constructor
fname | filename of the file containing line based features | |
alpha | alphabet (type) to use for string features |
Definition at line 50 of file StringFileFeatures.h.
virtual ~CStringFileFeatures | ( | ) | [virtual] |
default destructor
Definition at line 60 of file StringFileFeatures.h.
virtual void cleanup | ( | ) | [protected, virtual] |
cleanup string features
Reimplemented from CStringFeatures< ST >.
Definition at line 112 of file StringFileFeatures.h.
virtual void cleanup_feature_vector | ( | int32_t | num | ) | [protected, virtual] |
cleanup a single feature vector
Reimplemented from CStringFeatures< ST >.
Definition at line 131 of file StringFileFeatures.h.
void fetch_meta_info_from_file | ( | int32_t | granularity = 1048576 |
) | [protected] |
obtain meta information from file
i.e., determine number of strings and their lengths
Definition at line 141 of file StringFileFeatures.h.
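A rough sketch of what collecting this meta information might involve is shown below. It is an illustration only, not the library's code: `LineInfo` and `scan_lines` are hypothetical names, and the sketch assumes that `granularity` is the number of entries by which the bookkeeping array grows, which the documentation above does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record: where one string starts and how long it is.
struct LineInfo
{
    uint64_t offset;
    uint64_t length;
};

// Hypothetical helper: one pass over a flat buffer (such as a memory
// mapped file) that records offset and length of every '\n'-terminated
// line. ASSUMPTION: granularity is the growth step of the result array.
std::vector<LineInfo> scan_lines(const char* data, uint64_t file_length,
                                 std::size_t granularity = 1048576)
{
    assert(granularity > 0);

    std::vector<LineInfo> lines;
    uint64_t start = 0;

    for (uint64_t i = 0; i < file_length; ++i)
    {
        if (data[i] != '\n')
            continue;

        // Grow in fixed steps instead of on every insertion.
        if (lines.size() == lines.capacity())
            lines.reserve(lines.capacity() + granularity);

        lines.push_back({start, i - start}); // length excludes the '\n'
        start = i + 1;
    }

    return lines;
}
```

The number of strings is then `lines.size()`, and each string's length is available without rescanning the file.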
ST* get_line | ( | uint64_t & | len, | |
uint64_t & | offs, | |||
int32_t & | line_nr, | |||
uint64_t | file_length | |||
) | [protected] |
get next line from file
The returned line may be modified if the file was opened read/write. Otherwise it is read-only.
len | length of line (returned via reference) | |
offs | offset to pass when reading the next line; should be 0 initially (returned via reference) | |
line_nr | used to indicate errors; should be 0 initially (returned via reference) | |
file_length | total length of the file (for error checking) |
Definition at line 81 of file StringFileFeatures.h.
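The calling convention described by these parameters can be sketched as follows. This is a hedged stand-in, not the member function itself: `next_line` is a hypothetical free function that works on a plain buffer, and incrementing `line_nr` per returned line (so a caller can report which line failed) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for get_line(): scans a flat buffer for the next
// '\n'-terminated line, starting at offs.
// Returns a pointer to the start of the line, or nullptr when no complete
// line remains. len, offs and line_nr are updated via reference.
template <class ST>
ST* next_line(ST* data, uint64_t& len, uint64_t& offs, int32_t& line_nr,
              uint64_t file_length)
{
    assert(data != nullptr);

    for (uint64_t i = offs; i < file_length; ++i)
    {
        if (data[i] == static_cast<ST>('\n'))
        {
            ST* line = data + offs;
            len = i - offs; // length excludes the terminator
            offs = i + 1;   // the next call resumes after the '\n'
            ++line_nr;      // ASSUMPTION: counts lines for error reporting
            return line;
        }
    }

    // No terminator found before file_length: nothing more to return.
    len = 0;
    offs = file_length;
    return nullptr;
}
```

A caller would start with `offs` and `line_nr` set to 0 and call the function repeatedly until it returns a null pointer. The returned pointer refers into the buffer itself, so the line is not '\0'-terminated and `len` must be used to bound it.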
CMemoryMappedFile<ST>* file [protected] |
memory mapped file
Definition at line 190 of file StringFileFeatures.h.