File based string features.
String features that are backed by a file. Memory mapped files are used underneath. The class is derived from CStringFeatures, so all CStringFeatures functionality is available transparently.
The supported file format contains one string per line. Lines of variable length are supported and must be separated by '\n'.
Definition at line 34 of file StringFileFeatures.h.
Public Member Functions | |
CStringFileFeatures () | |
CStringFileFeatures (const char *fname, EAlphabet alpha) | |
virtual | ~CStringFileFeatures () |
Protected Member Functions | |
ST * | get_line (uint64_t &len, uint64_t &offs, int32_t &line_nr, uint64_t file_length) |
virtual void | cleanup () |
virtual void | cleanup_feature_vector (int32_t num) |
void | fetch_meta_info_from_file (int32_t granularity=1048576) |
Protected Attributes | |
CMemoryMappedFile< ST > * | file |
CStringFileFeatures ()
default constructor
Definition at line 6 of file StringFileFeatures.cpp.
CStringFileFeatures (const char *fname, EAlphabet alpha)
constructor
fname | filename of the file containing line based features
alpha | alphabet (type) to use for string features
Definition at line 10 of file StringFileFeatures.cpp.
~CStringFileFeatures () [virtual]
default destructor
Definition at line 17 of file StringFileFeatures.cpp.
void cleanup () [protected, virtual]
cleanup string features
Reimplemented from CStringFeatures< ST >.
Definition at line 53 of file StringFileFeatures.cpp.
void cleanup_feature_vector (int32_t num) [protected, virtual]
cleanup a single feature vector
Reimplemented from CStringFeatures< ST >.
Definition at line 71 of file StringFileFeatures.cpp.
void fetch_meta_info_from_file (int32_t granularity=1048576) [protected]
obtain meta information from the file, i.e., determine the number of strings and their lengths
Definition at line 77 of file StringFileFeatures.cpp.
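The following is a minimal sketch of what "determine the number of strings and their lengths" can look like for the '\n'-separated format described above. It is not the library's implementation: scan_line_lengths is a hypothetical helper that works on a plain in-memory buffer, and it does not model the granularity parameter. It also assumes that a final line without a trailing '\n' counts as a string.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the documented class): scan a buffer of
// '\n'-separated strings and record the length of every line.
// The number of strings is the size of the returned vector.
std::vector<uint64_t> scan_line_lengths(const char* data, uint64_t file_length)
{
    assert(data != nullptr || file_length == 0);

    std::vector<uint64_t> lengths;
    uint64_t line_start = 0;

    for (uint64_t i = 0; i < file_length; ++i)
    {
        if (data[i] == '\n')
        {
            // The separator itself is not part of the string.
            lengths.push_back(i - line_start);
            line_start = i + 1;
        }
    }

    // Assumption: a last line without a trailing '\n' is still a string.
    if (line_start < file_length)
        lengths.push_back(file_length - line_start);

    return lengths;
}
```

Calling scan_line_lengths on a buffer holding "ACGT\nAC\n" yields two entries, 4 and 2. Empty lines are reported with length 0.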
ST * get_line (uint64_t &len, uint64_t &offs, int32_t &line_nr, uint64_t file_length) [protected]
get next line from file
The returned line may be modified if the file was opened read/write. Otherwise it is read-only.
len | length of line (returned via reference)
offs | offset to be passed for reading the next line; should be 0 initially (returned via reference)
line_nr | line number, used to indicate errors; should be 0 initially (returned via reference)
file_length | total length of the file (for error checking)
Definition at line 23 of file StringFileFeatures.cpp.
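To illustrate the calling convention described by the parameters above (offs and line_nr start at 0 and are advanced via reference), here is a small sketch. next_line is a hypothetical stand-in that reads from a plain in-memory buffer rather than from the class's memory mapped file. Returning a null pointer once offs reaches file_length is an assumption of this sketch, not documented behaviour of get_line.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the documented get_line contract.
// Returns a pointer to the start of the next line and stores its length in
// len. offs is advanced past the '\n' separator and line_nr is incremented.
// Assumption of this sketch: nullptr is returned when no line is left.
const char* next_line(const char* data, uint64_t& len, uint64_t& offs,
                      int32_t& line_nr, uint64_t file_length)
{
    assert(offs <= file_length);

    if (offs >= file_length)
    {
        len = 0;
        return nullptr;
    }

    const char* start = data + offs;
    uint64_t i = offs;
    while (i < file_length && data[i] != '\n')
        ++i;

    len = i - offs;
    // Skip the separator if present so the next call starts on the next line.
    offs = (i < file_length) ? i + 1 : i;
    ++line_nr;
    return start;
}
```

A caller would initialise len, offs and line_nr to 0 and call next_line repeatedly until it returns a null pointer, reusing the same offs and line_nr variables across calls.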
CMemoryMappedFile<ST>* file [protected]
memory mapped file
Definition at line 86 of file StringFileFeatures.h.