Detailed Description

Hidden Markov Model.

Structure and function collection. This class implements a Hidden Markov Model. For a tutorial on HMMs, see L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", 1989.

Several functions are supplied for tasks such as training, reading/writing models, reading observations, and calculating derivatives.

Definition at line 365 of file HMM.h.

Public Member Functions

CHMM (void)

bool alloc_state_dependend_arrays ()

allocates memory that depends on N

void free_state_dependend_arrays ()

free memory that depends on N

bool linear_train (bool right_align=false)

estimates linear model from observations.

bool permutation_entropy (int32_t window_width, int32_t sequence_number)

compute permutation entropy

virtual const char * get_name () const

Constructor/Destructor and helper function

CHMM (int32_t N, int32_t M, Model *model, float64_t PSEUDO)

CHMM (CStringFeatures< uint16_t > *obs, int32_t N, int32_t M, float64_t PSEUDO)

CHMM (int32_t N, float64_t *p, float64_t *q, float64_t *a)

CHMM (int32_t N, float64_t *p, float64_t *q, int32_t num_trans, float64_t *a_trans)

CHMM (FILE *model_file, float64_t PSEUDO)

CHMM (CHMM *h)

Constructor - Clone model h.

virtual ~CHMM ()

Destructor - Cleanup.

virtual bool train (CFeatures *data=NULL)

virtual int32_t get_num_model_parameters ()

virtual float64_t get_log_model_parameter (int32_t num_param)

virtual float64_t get_log_derivative (int32_t num_param, int32_t num_example)

virtual float64_t get_log_likelihood_example (int32_t num_example)

bool initialize (Model *model, float64_t PSEUDO, FILE *model_file=NULL)

probability functions.

forward/backward/viterbi algorithm

float64_t forward_comp (int32_t time, int32_t state, int32_t dimension)

float64_t forward_comp_old (int32_t time, int32_t state, int32_t dimension)

float64_t backward_comp (int32_t time, int32_t state, int32_t dimension)

float64_t backward_comp_old (int32_t time, int32_t state, int32_t dimension)

float64_t best_path (int32_t dimension)

uint16_t get_best_path_state (int32_t dim, int32_t t)

float64_t model_probability_comp ()

float64_t model_probability (int32_t dimension=-1)

inline proxy for model probability.

float64_t linear_model_probability (int32_t dimension)

convergence criteria

bool set_iterations (int32_t num)

int32_t get_iterations ()

bool set_epsilon (float64_t eps)

float64_t get_epsilon ()

bool baum_welch_viterbi_train (BaumWelchViterbiType type)

model training

void estimate_model_baum_welch (CHMM *train)

void estimate_model_baum_welch_trans (CHMM *train)

void estimate_model_baum_welch_old (CHMM *train)

void estimate_model_baum_welch_defined (CHMM *train)

void estimate_model_viterbi (CHMM *train)

void estimate_model_viterbi_defined (CHMM *train)

output functions.

void output_model (bool verbose=false)

void output_model_defined (bool verbose=false)

performs output_model only for the defined transitions etc

model helper functions.

void normalize (bool keep_dead_states=false)

normalize the model to satisfy stochasticity

void add_states (int32_t num_states, float64_t default_val=0)

bool append_model (CHMM *append_model, float64_t *cur_out, float64_t *app_out)

bool append_model (CHMM *append_model)

void chop (float64_t value)

set any model parameter with probability smaller than value to ZERO

void convert_to_log ()

convert model to log probabilities

void init_model_random ()

init model with random values

void init_model_defined ()

void clear_model ()

initializes model with log(PSEUDO)

void clear_model_defined ()

initializes only parameters in learn_x with log(PSEUDO)

void copy_model (CHMM *l)

copies the model parameters from l

void invalidate_model ()

bool get_status () const

float64_t get_pseudo () const

returns current pseudo value

void set_pseudo (float64_t pseudo)

sets current pseudo value

void set_observations (CStringFeatures< uint16_t > *obs, CHMM *hmm=NULL)

void set_observation_nocache (CStringFeatures< uint16_t > *obs)

CStringFeatures< uint16_t > * get_observations ()

return observation pointer

load/save functions.

for observations/model/traindefinitions

bool load_definitions (FILE *file, bool verbose, bool initialize=true)

bool load_model (FILE *file)

bool save_model (FILE *file)

bool save_model_derivatives (FILE *file)

bool save_model_derivatives_bin (FILE *file)

bool save_model_bin (FILE *file)

bool check_model_derivatives ()

numerically check whether derivatives were calculated correctly

bool check_model_derivatives_combined ()

T_STATES * get_path (int32_t dim, float64_t &prob)

bool save_path (FILE *file)

bool save_path_derivatives (FILE *file)

bool save_path_derivatives_bin (FILE *file)

bool save_likelihood_bin (FILE *file)

bool save_likelihood (FILE *file)

access functions for model parameters

for all the arrays a,b,p,q,A,B,psi and scalar model parameters like N,M

T_STATES get_N () const

access function for number of states N

int32_t get_M () const

access function for number of observations M

void set_q (T_STATES offset, float64_t value)

void set_p (T_STATES offset, float64_t value)

void set_A (T_STATES line_, T_STATES column, float64_t value)

void set_a (T_STATES line_, T_STATES column, float64_t value)

void set_B (T_STATES line_, uint16_t column, float64_t value)

void set_b (T_STATES line_, uint16_t column, float64_t value)

void set_psi (int32_t time, T_STATES state, T_STATES value, int32_t dimension)

float64_t get_q (T_STATES offset) const

float64_t get_p (T_STATES offset) const

float64_t get_A (T_STATES line_, T_STATES column) const

float64_t get_a (T_STATES line_, T_STATES column) const

float64_t get_B (T_STATES line_, uint16_t column) const

float64_t get_b (T_STATES line_, uint16_t column) const

T_STATES get_psi (int32_t time, T_STATES state, int32_t dimension) const

functions for observations

management and access functions for observation matrix

float64_t state_probability (int32_t time, int32_t state, int32_t dimension)

calculates probability of being in state i at time t for dimension

float64_t transition_probability (int32_t time, int32_t state_i, int32_t state_j, int32_t dimension)

calculates probability of being in state i at time t and state j at time t+1 for dimension

derivatives of model probabilities.

computes log dp(lambda)/d lambda_i

Parameters:

	dimension	dimension for which derivatives are calculated
	i,j	parameter specific

float64_t linear_model_derivative (T_STATES i, uint16_t j, int32_t dimension)

float64_t model_derivative_p (T_STATES i, int32_t dimension)

float64_t model_derivative_q (T_STATES i, int32_t dimension)

float64_t model_derivative_a (T_STATES i, T_STATES j, int32_t dimension)

computes log dp(lambda)/d a_ij.

float64_t model_derivative_b (T_STATES i, uint16_t j, int32_t dimension)

computes log dp(lambda)/d b_ij.

derivatives of path probabilities.

computes d log p(lambda,best_path)/d lambda_i

Parameters:

	dimension	dimension for which derivatives are calculated
	i,j	parameter specific

float64_t path_derivative_p (T_STATES i, int32_t dimension)

computes d log p(lambda,best_path)/d p_i

float64_t path_derivative_q (T_STATES i, int32_t dimension)

computes d log p(lambda,best_path)/d q_i

float64_t path_derivative_a (T_STATES i, T_STATES j, int32_t dimension)

computes d log p(lambda,best_path)/d a_ij

float64_t path_derivative_b (T_STATES i, uint16_t j, int32_t dimension)

computes d log p(lambda,best_path)/d b_ij

Protected Member Functions

void prepare_path_derivative (int32_t dim)

initialization function that is called before path_derivatives are calculated

float64_t forward (int32_t time, int32_t state, int32_t dimension)

inline proxies for forward pass

float64_t backward (int32_t time, int32_t state, int32_t dimension)

inline proxies for backward pass

input helper functions.

for reading model/definition/observation files

bool get_numbuffer (FILE *file, char *buffer, int32_t length)

put a sequence of numbers into the buffer

void open_bracket (FILE *file)

expect open bracket.

void close_bracket (FILE *file)

expect closing bracket

bool comma_or_space (FILE *file)

expect comma or space.

void error (int32_t p_line, const char *str)

parse error messages

Protected Attributes

float64_t * arrayN1

float64_t * arrayN2

T_ALPHA_BETA alpha_cache

cache for forward variables; can be terribly huge, O(T*N)

T_ALPHA_BETA beta_cache

cache for backward variables; can be terribly huge, O(T*N)

T_STATES * states_per_observation_psi

backtracking table for Viterbi; can be terribly huge, O(T*N)

T_STATES * path

best path (=state sequence) through model

bool path_prob_updated

true if path probability is up to date

int32_t path_prob_dimension

dimension for which path_prob was calculated

model specific variables.

these are p,q,a,b,N,M etc

int32_t M

number of observation symbols, e.g. ACGT -> 0123

int32_t N

number of states

float64_t PSEUDO

define pseudocounts against overfitting

int32_t line

CStringFeatures< uint16_t > * p_observations

observation matrix

Model * model

float64_t * transition_matrix_A

matrix of absolute counts of transitions

float64_t * observation_matrix_B

matrix of absolute counts of observations within each state

float64_t * transition_matrix_a

transition matrix

float64_t * initial_state_distribution_p

initial distribution of states

float64_t * end_state_distribution_q

distribution of end-states

float64_t * observation_matrix_b

distribution of observations within each state

int32_t iterations

convergence criterion iterations

int32_t iteration_count

float64_t epsilon

convergence criterion epsilon

int32_t conv_it

float64_t all_pat_prob

probability of best path

float64_t pat_prob

probability of best path

float64_t mod_prob

probability of model

bool mod_prob_updated

true if model probability is up to date

bool all_path_prob_updated

true if path probability is up to date

int32_t path_deriv_dimension

dimension for which path_deriv was calculated

bool path_deriv_updated

true if path derivative is up to date

bool loglikelihood

bool status

bool reused_caches

Static Protected Attributes

static const int32_t GOTN = (1<<1)

static const int32_t GOTM = (1<<2)

static const int32_t GOTO = (1<<3)

static const int32_t GOTa = (1<<4)

static const int32_t GOTb = (1<<5)

static const int32_t GOTp = (1<<6)

static const int32_t GOTq = (1<<7)

static const int32_t GOTlearn_a = (1<<1)

static const int32_t GOTlearn_b = (1<<2)

static const int32_t GOTlearn_p = (1<<3)

static const int32_t GOTlearn_q = (1<<4)

static const int32_t GOTconst_a = (1<<5)

static const int32_t GOTconst_b = (1<<6)

static const int32_t GOTconst_p = (1<<7)

static const int32_t GOTconst_q = (1<<8)

Constructor & Destructor Documentation

CHMM ( void )

Default constructor. Train definitions: encapsulates model parameters that are constant or shall be learned. Consists of structures and access functions for learning only defined transitions and constants.

Definition at line 143 of file HMM.cpp.

CHMM	(	int32_t	N,
		int32_t	M,
		Model *	model,
		float64_t	PSEUDO
	)

Constructor

Parameters:

	N	number of states
	M	number of emissions
	model	model which holds definitions of states to be learned + consts
	PSEUDO	Pseudo Value

Definition at line 163 of file HMM.cpp.

CHMM	(	CStringFeatures< uint16_t > *	obs,
		int32_t	N,
		int32_t	M,
		float64_t	PSEUDO
	)

Definition at line 175 of file HMM.cpp.

CHMM	(	int32_t	N,
		float64_t *	p,
		float64_t *	q,
		float64_t *	a
	)

Definition at line 190 of file HMM.cpp.

CHMM	(	int32_t	N,
		float64_t *	p,
		float64_t *	q,
		int32_t	num_trans,
		float64_t *	a_trans
	)

Definition at line 242 of file HMM.cpp.

CHMM	(	FILE *	model_file,
		float64_t	PSEUDO
	)

Constructor - Initialization from model file.

Parameters:

	model_file	Filehandle to a hmm model file (*.mod)
	PSEUDO	Pseudo Value

Definition at line 354 of file HMM.cpp.

CHMM ( CHMM * h )

Constructor - Clone model h.

Definition at line 151 of file HMM.cpp.

~CHMM ( ) [virtual]

Destructor - Cleanup.

Definition at line 362 of file HMM.cpp.

Member Function Documentation

void add_states	(	int32_t	num_states,
		float64_t	default_val = `0`
	)

Increases the number of states by num_states. The new a/b/p/q values are given the value default_val, where 0 <= default_val <= 1.

Definition at line 5015 of file HMM.cpp.

bool alloc_state_dependend_arrays ( )

allocates memory that depends on N

Definition at line 458 of file HMM.cpp.

bool append_model	(	CHMM *	append_model,
		float64_t *	cur_out,
		float64_t *	app_out
	)

Appends append_model to the current HMM, i.e. two extra states are created. One is the end state of the current HMM with outputs cur_out (of size M) and the other is the start state of append_model. The transition probability from state 1 to states 1 is 1.

Definition at line 4907 of file HMM.cpp.

bool append_model ( CHMM * append_model )

Appends append_model to the current HMM; here no extra states are created. The former q_i are multiplied by q_ji to give the a_ij from the current HMM to append_model.

Definition at line 4815 of file HMM.cpp.

float64_t backward	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)			`[protected]`

inline proxies for backward pass

Definition at line 1556 of file HMM.h.

float64_t backward_comp	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)

Backward algorithm. Calculates Pr[O_{t+1}, O_{t+2}, ..., O_{T-1} | q_time = S_i, lambda] for 0 <= time <= T-1, and Pr[O | lambda] for time >= T.

Parameters:

	time	t
	state	i
	dimension	dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0, O_1, ..., O_{T-1})

Definition at line 875 of file HMM.cpp.

float64_t backward_comp_old	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)

Definition at line 974 of file HMM.cpp.

bool baum_welch_viterbi_train ( BaumWelchViterbiType type )

interface for e.g. GUIHMM to run BaumWelch or Viterbi training

Parameters:

	type	type of BaumWelch/Viterbi training

Definition at line 5532 of file HMM.cpp.

float64_t best_path ( int32_t dimension )

Calculates the probability of the best state sequence s_0, ..., s_{T-1} and the path itself using the Viterbi algorithm. Afterwards the path can be found in the array PATH(dimension)[0..T-1].

Parameters:

	dimension	dimension of observation for which the most probable path is calculated (observations are a matrix, where a row stands for one dimension, i.e. O_0, O_1, ..., O_{T-1})

Definition at line 1106 of file HMM.cpp.
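The Viterbi recursion described for best_path can be sketched in standalone C++ as follows. This is a minimal illustration in plain probability space, not the CHMM implementation: ToyHmm, ViterbiResult and viterbi are hypothetical names, and weighting the last step by the end-state distribution q is an assumption about how q enters the path probability.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the model, not the CHMM class.
// All values are plain probabilities (no logarithms).
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    std::vector<double> q;  // end state distribution, size N
    Matrix a;               // transition matrix, N x N
    Matrix b;               // emission matrix, N x M
};

struct ViterbiResult {
    std::vector<int> path;  // most probable state sequence s_0..s_{T-1}
    double prob;            // joint probability of that path and the observations
};

ViterbiResult viterbi(const ToyHmm& h, const std::vector<int>& obs) {
    const std::size_t N = h.p.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);

    Matrix delta(T, std::vector<double>(N, 0.0));                  // best score per (t, state)
    std::vector<std::vector<int>> psi(T, std::vector<int>(N, 0));  // backtracking table

    for (std::size_t i = 0; i < N; ++i)
        delta[0][i] = h.p[i] * h.b[i][obs[0]];

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t j = 0; j < N; ++j) {
            int best = 0;
            double best_val = delta[t - 1][0] * h.a[0][j];
            for (std::size_t i = 1; i < N; ++i) {
                const double v = delta[t - 1][i] * h.a[i][j];
                if (v > best_val) { best_val = v; best = static_cast<int>(i); }
            }
            delta[t][j] = best_val * h.b[j][obs[t]];
            psi[t][j] = best;
        }
    }

    // Termination (assumption: the final state is weighted by q).
    int last = 0;
    double best_prob = delta[T - 1][0] * h.q[0];
    for (std::size_t i = 1; i < N; ++i) {
        const double v = delta[T - 1][i] * h.q[i];
        if (v > best_prob) { best_prob = v; last = static_cast<int>(i); }
    }

    // Follow psi backwards to recover the state sequence.
    std::vector<int> path(T, 0);
    path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = psi[t][path[t]];
    return {path, best_prob};
}
```

Ties are resolved in favour of the lower state index here; a log-space variant would replace the products with sums.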

bool check_model_derivatives ( )

numerically check whether derivatives were calculated correctly

Definition at line 4572 of file HMM.cpp.
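One common way to perform such a numerical check is to compare an analytic derivative with a central finite difference. The sketch below does this for the derivative of Pr[O|lambda] with respect to p_i on a toy model. It is an illustration under stated assumptions, not the CHMM code: all names are hypothetical, values are plain probabilities rather than logarithms, and the backward recursion is assumed to start from the end-state distribution q.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the model, not the CHMM class.
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    std::vector<double> q;  // end state distribution, size N
    Matrix a;               // transition matrix, N x N
    Matrix b;               // emission matrix, N x M
};

// beta[t][i] = Pr[O_{t+1}..O_{T-1} | q_t = S_i, lambda], with beta[T-1][i] = q[i].
Matrix backward_table(const ToyHmm& h, const std::vector<int>& obs) {
    const std::size_t N = h.p.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);
    Matrix beta(T, std::vector<double>(N, 0.0));
    beta[T - 1] = h.q;
    for (std::size_t t = T - 1; t > 0; --t)
        for (std::size_t i = 0; i < N; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < N; ++j)
                sum += h.a[i][j] * h.b[j][obs[t]] * beta[t][j];
            beta[t - 1][i] = sum;
        }
    return beta;
}

// Pr[O | lambda] = sum_i p_i * b_i(O_0) * beta_0(i)
double sequence_probability(const ToyHmm& h, const std::vector<int>& obs) {
    const Matrix beta = backward_table(h, obs);
    double prob = 0.0;
    for (std::size_t i = 0; i < h.p.size(); ++i)
        prob += h.p[i] * h.b[i][obs[0]] * beta[0][i];
    return prob;
}

// Analytic derivative: d Pr[O | lambda] / d p_i = b_i(O_0) * beta_0(i)
double analytic_derivative_p(const ToyHmm& h, const std::vector<int>& obs, std::size_t i) {
    return h.b[i][obs[0]] * backward_table(h, obs)[0][i];
}

// Central finite difference on a copy of the model.
double numeric_derivative_p(ToyHmm h, const std::vector<int>& obs, std::size_t i,
                            double step = 1e-6) {
    h.p[i] += step;
    const double up = sequence_probability(h, obs);
    h.p[i] -= 2.0 * step;
    const double down = sequence_probability(h, obs);
    return (up - down) / (2.0 * step);
}

bool derivative_p_matches(const ToyHmm& h, const std::vector<int>& obs, std::size_t i,
                          double tolerance = 1e-6) {
    return std::fabs(analytic_derivative_p(h, obs, i) - numeric_derivative_p(h, obs, i))
           <= tolerance;
}
```

The same pattern extends to q, a and b by perturbing the corresponding entry instead of p_i.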

bool check_model_derivatives_combined ( )

Definition at line 4502 of file HMM.cpp.

void chop ( float64_t value )

set any model parameter with probability smaller than value to ZERO

Definition at line 5075 of file HMM.cpp.

void clear_model ( )

initializes model with log(PSEUDO)

Definition at line 2614 of file HMM.cpp.

void clear_model_defined ( )

initializes only parameters in learn_x with log(PSEUDO)

Definition at line 2630 of file HMM.cpp.

void close_bracket ( FILE * file ) [protected]

expect closing bracket

Definition at line 2777 of file HMM.cpp.

bool comma_or_space ( FILE * file ) [protected]

expect comma or space.

Definition at line 2790 of file HMM.cpp.

void convert_to_log ( )

convert model to log probabilities

Definition at line 2347 of file HMM.cpp.
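To illustrate what working with log probabilities involves, here is a small standalone C++ sketch. It is not the CHMM implementation; to_log, log_add and convert_to_log are hypothetical free functions, and mapping a zero probability to negative infinity is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Convert one probability to log space; log(0) is represented as -infinity.
double to_log(double prob) {
    assert(prob >= 0.0);
    return prob > 0.0 ? std::log(prob) : -std::numeric_limits<double>::infinity();
}

// Compute log(exp(x) + exp(y)) without leaving log space.
double log_add(double x, double y) {
    if (std::isinf(x) && x < 0.0) return y;  // adding probability zero
    if (std::isinf(y) && y < 0.0) return x;
    const double hi = std::max(x, y);
    const double lo = std::min(x, y);
    return hi + std::log1p(std::exp(lo - hi));
}

// Convert a whole parameter vector in place.
void convert_to_log(std::vector<double>& params) {
    for (double& value : params)
        value = to_log(value);
}
```

In log space, products of probabilities become sums, and sums of probabilities go through log_add, which keeps long sequences from underflowing.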

void copy_model ( CHMM * l )

copies the model parameters from l

Definition at line 2653 of file HMM.cpp.

void error	(	int32_t	p_line,
		const char *	str
	)			`[protected]`

parse error messages

Definition at line 1501 of file HMM.h.

void estimate_model_baum_welch ( CHMM * train )

uses baum-welch-algorithm to train a fully connected HMM.

Parameters:

	train	model from which the new model is estimated

Definition at line 1482 of file HMM.cpp.

void estimate_model_baum_welch_defined ( CHMM * train )

uses baum-welch-algorithm to train the defined transitions etc.

Parameters:

	train	model from which the new model is estimated

Definition at line 1723 of file HMM.cpp.

void estimate_model_baum_welch_old ( CHMM * train )

Definition at line 1568 of file HMM.cpp.

void estimate_model_baum_welch_trans ( CHMM * train )

Definition at line 1653 of file HMM.cpp.

void estimate_model_viterbi ( CHMM * train )

uses viterbi training to train a fully connected HMM

Parameters:

	train	model from which the new model is estimated

Definition at line 1899 of file HMM.cpp.

void estimate_model_viterbi_defined ( CHMM * train )

uses viterbi training to train the defined transitions etc.

Parameters:

	train	model from which the new model is estimated

Definition at line 2026 of file HMM.cpp.

float64_t forward	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)			`[protected]`

inline proxies for forward pass

Definition at line 1539 of file HMM.h.

float64_t forward_comp	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)

Forward algorithm. Calculates Pr[O_0, O_1, ..., O_t, q_time = S_i | lambda] for 0 <= time <= T-1, and Pr[O | lambda] for time > T.

Parameters:

	time	t
	state	i
	dimension	dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0, O_1, ..., O_{T-1})

Definition at line 639 of file HMM.cpp.
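As a rough illustration of the forward recursion described above, the following standalone C++ sketch fills the forward table in plain probability space. It does not use the CHMM class: ToyHmm, forward_table and sequence_probability are hypothetical names, and the termination step that weights the last column by the end-state distribution q is an assumption about how q enters Pr[O|lambda].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the model, not the CHMM class.
// All values are plain probabilities (no logarithms).
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    std::vector<double> q;  // end state distribution, size N
    Matrix a;               // transition matrix, N x N
    Matrix b;               // emission matrix, N x M
};

// alpha[t][i] = Pr[O_0..O_t, q_t = S_i | lambda]
Matrix forward_table(const ToyHmm& h, const std::vector<int>& obs) {
    const std::size_t N = h.p.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);
    Matrix alpha(T, std::vector<double>(N, 0.0));

    // Initialization: start in state i and emit O_0.
    for (std::size_t i = 0; i < N; ++i)
        alpha[0][i] = h.p[i] * h.b[i][obs[0]];

    // Induction: sum over all predecessors, then emit O_t.
    for (std::size_t t = 1; t < T; ++t)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t i = 0; i < N; ++i)
                sum += alpha[t - 1][i] * h.a[i][j];
            alpha[t][j] = sum * h.b[j][obs[t]];
        }
    return alpha;
}

// Pr[O | lambda] = sum_i alpha[T-1][i] * q[i] (assumed termination step).
double sequence_probability(const ToyHmm& h, const std::vector<int>& obs) {
    const Matrix alpha = forward_table(h, obs);
    double prob = 0.0;
    for (std::size_t i = 0; i < h.q.size(); ++i)
        prob += alpha.back()[i] * h.q[i];
    return prob;
}
```

The table costs O(T*N) memory, which is the reason the class documentation warns about the size of the forward cache.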

float64_t forward_comp_old	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)

Definition at line 743 of file HMM.cpp.

void free_state_dependend_arrays ( )

free memory that depends on N

Definition at line 515 of file HMM.cpp.

float64_t get_A	(	T_STATES	line_,
		T_STATES	column
	)			const

access function for matrix A

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...N-1

Returns: value at position (line_, column)

Definition at line 1111 of file HMM.h.

float64_t get_a	(	T_STATES	line_,
		T_STATES	column
	)			const

access function for matrix a

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...N-1

Returns: value at position (line_, column)

Definition at line 1125 of file HMM.h.

float64_t get_B	(	T_STATES	line_,
		uint16_t	column
	)			const

access function for matrix B

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...M-1

Returns: value at position (line_, column)

Definition at line 1139 of file HMM.h.

float64_t get_b	(	T_STATES	line_,
		uint16_t	column
	)			const

access function for matrix b

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...M-1

Returns: value at position (line_, column)

Definition at line 1153 of file HMM.h.

uint16_t get_best_path_state	(	int32_t	dim,
		int32_t	t
	)

Definition at line 559 of file HMM.h.

float64_t get_epsilon ( )

Definition at line 624 of file HMM.h.

int32_t get_iterations ( )

Definition at line 622 of file HMM.h.

float64_t get_log_derivative	(	int32_t	num_param,
		int32_t	num_example
	)			`[virtual]`

get partial derivative of likelihood function (logarithmic)

abstract base method

Parameters:

	num_param	derivative against which param
	num_example	which example

Returns: derivative of likelihood (logarithmic)

Implements CDistribution.

Definition at line 5465 of file HMM.cpp.

virtual float64_t get_log_likelihood_example ( int32_t num_example ) [virtual]

compute log likelihood for example

abstract base method

Parameters:

	num_example	which example

Returns: log likelihood for example

Implements CDistribution.

Definition at line 509 of file HMM.h.

float64_t get_log_model_parameter ( int32_t num_param ) [virtual]

get model parameter (logarithmic)

abstract base method

Returns: model parameter (logarithmic)

Implements CDistribution.

Definition at line 5490 of file HMM.cpp.

int32_t get_M ( ) const

access function for number of observations M

Definition at line 980 of file HMM.h.

T_STATES get_N ( ) const

access function for number of states N

Definition at line 977 of file HMM.h.

virtual const char* get_name ( void ) const [virtual]

Returns: object name

Implements CSGObject.

Definition at line 1182 of file HMM.h.

virtual int32_t get_num_model_parameters ( ) [virtual]

get number of parameters in model

abstract base method

Returns: number of parameters in model

Implements CDistribution.

Definition at line 506 of file HMM.h.

bool get_numbuffer	(	FILE *	file,
		char *	buffer,
		int32_t	length
	)			`[protected]`

put a sequence of numbers into the buffer

Definition at line 2817 of file HMM.cpp.

CStringFeatures<uint16_t>* get_observations ( )

return observation pointer

Definition at line 795 of file HMM.h.

float64_t get_p ( T_STATES offset ) const

access function for probability of initial states

Parameters:

	offset	index 0...N-1

Returns: value at offset

Definition at line 1097 of file HMM.h.

T_STATES * get_path	(	int32_t	dim,
		float64_t &	prob
	)

get viterbi path and path probability

Parameters:

	dim	dimension for which to obtain best path
	prob	likelihood of path

Returns: viterbi path

Definition at line 4025 of file HMM.cpp.

float64_t get_pseudo ( ) const

returns current pseudo value

Definition at line 748 of file HMM.h.

T_STATES get_psi	(	int32_t	time,
		T_STATES	state,
		int32_t	dimension
	)			const

access function for backtracking table psi

Parameters:

	time	time 0...T-1
	state	state 0...N-1
	dimension	dimension of observations 0...DIMENSION-1

Returns: state at specified time and position

Definition at line 1169 of file HMM.h.

float64_t get_q ( T_STATES offset ) const

access function for probability of end states

Parameters:

	offset	index 0...N-1

Returns: value at offset

Definition at line 1084 of file HMM.h.

bool get_status ( ) const

get status

Returns: true if everything is ok, else false

Definition at line 742 of file HMM.h.

void init_model_defined ( )

Init model according to const_x and learn_x. First the model is initialized with 0 for all parameters, then the parameters in learn_x are initialized with random values; finally the const_x parameters are set and the model is normalized.

Definition at line 2460 of file HMM.cpp.

void init_model_random ( )

init model with random values

Definition at line 2394 of file HMM.cpp.

bool initialize	(	Model *	model,
		float64_t	PSEUDO,
		FILE *	model_file = `NULL`
	)

initialization function - gets called by constructors.

Parameters:

	model	model which holds definitions of states to be learned + consts
	PSEUDO	Pseudo Value
	model_file	Filehandle to a hmm model file (*.mod)

Definition at line 550 of file HMM.cpp.

void invalidate_model ( )

Invalidates all caches. This function has to be called when direct changes to the model have been made, so that the forward/backward/Viterbi algorithms do not work with old tables.

Definition at line 2669 of file HMM.cpp.

float64_t linear_model_derivative	(	T_STATES	i,
		uint16_t	j,
		int32_t	dimension
	)

computes log dp(lambda)/d b_ij for linear model

Definition at line 1389 of file HMM.h.

float64_t linear_model_probability ( int32_t dimension )

calculates likelihood for linear model on observations in MEMORY

Parameters:

	dimension	dimension for which probability is calculated

Returns: model probability

Definition at line 589 of file HMM.h.

bool linear_train ( bool right_align = false )

estimates linear model from observations.

Definition at line 5103 of file HMM.cpp.

bool load_definitions	(	FILE *	file,
		bool	verbose,
		bool	initialize = `true`
	)

Read definitions file (learn_x, const_x) used for training.

Format specs: definition_file (train.def)

	% HMM-TRAIN - specification
	% learn_a - elements in state_transition_matrix to be learned
	% learn_b - elements in observation_per_state_matrix to be learned
	%           note: each line stands for
	%           state, observation(0), observation(1)...observation(NOW)
	% learn_p - elements in initial distribution to be learned
	% learn_q - elements in the end-state distribution to be learned
	%
	% const_x - specifies initial values of elements
	%           rest is assumed to be 0.0
	%
	% NOTE: IMPLICIT DEFINES:
	%   define A 0
	%   define C 1
	%   define G 2
	%   define T 3

	learn_a=[ [int32_t,int32_t];
	          [int32_t,int32_t];
	          [int32_t,int32_t];
	          ........
	          [int32_t,int32_t];
	          [-1,-1];
	        ];

	learn_b=[ [int32_t,int32_t,int32_t,...,int32_t];
	          [int32_t,int32_t,int32_t,...,int32_t];
	          [int32_t,int32_t,int32_t,...,int32_t];
	          ........
	          [int32_t,int32_t,int32_t,...,int32_t];
	          [-1,-1];
	        ];

	learn_p= [ int32_t, ... , int32_t, -1 ];

	learn_q= [ int32_t, ... , int32_t, -1 ];

	const_a=[ [int32_t,int32_t,float64_t];
	          [int32_t,int32_t,float64_t];
	          [int32_t,int32_t,float64_t];
	          ........
	          [int32_t,int32_t,float64_t];
	          [-1,-1,-1];
	        ];

	const_b=[ [int32_t,int32_t,int32_t,...,int32_t,float64_t];
	          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
	          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
	          ........
	          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
	          [-1,-1,-1];
	        ];

	const_p[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];
	const_q[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];

Parameters:

	file	filehandle to definitions file
	verbose	true for verbose messages
	initialize	true to initialize to underlying HMM

Definition at line 3224 of file HMM.cpp.

bool load_model ( FILE * file )

Read model from file.

Format specs: model_file (model.hmm)

	% HMM - specification
	% N - number of states
	% M - number of observation_tokens
	% a is state_transition_matrix
	% size(a)= [N,N]
	%
	% b is observation_per_state_matrix
	% size(b)= [N,M]
	%
	% p is initial distribution
	% size(p)= [1, N]

	N=int32_t;
	M=int32_t;

	p=[float64_t,float64_t...float64_t];
	q=[float64_t,float64_t...float64_t];

	a=[ [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	  ];

	b=[ [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	    [float64_t,float64_t...float64_t];
	  ];

Parameters:

	file	filehandle to model file

Definition at line 2926 of file HMM.cpp.
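To make the layout of the model file format more concrete, the following standalone C++ sketch writes a small model as text that follows the specification above. It is only an illustration: format_vector, format_matrix and format_model are hypothetical helpers, and whether load_model accepts exactly this whitespace and number formatting has not been verified.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Render one row as "[v0,v1,...]" using the default stream precision.
std::string format_vector(const std::vector<double>& v) {
    std::ostringstream out;
    out << '[';
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i > 0) out << ',';
        out << v[i];
    }
    out << ']';
    return out.str();
}

// Render a matrix as "[ [row0]; [row1]; ]".
std::string format_matrix(const Matrix& m) {
    std::string text = "[ ";
    for (const std::vector<double>& row : m)
        text += format_vector(row) + "; ";
    return text + "]";
}

// Assemble N, M, p, q, a and b in the order given by the format description.
std::string format_model(const std::vector<double>& p, const std::vector<double>& q,
                         const Matrix& a, const Matrix& b) {
    assert(!p.empty());
    assert(q.size() == p.size() && a.size() == p.size() && b.size() == p.size());
    std::ostringstream out;
    out << "N=" << p.size() << ";\n";
    out << "M=" << b[0].size() << ";\n";
    out << "p=" << format_vector(p) << ";\n";
    out << "q=" << format_vector(q) << ";\n";
    out << "a=" << format_matrix(a) << ";\n";
    out << "b=" << format_matrix(b) << ";\n";
    return out.str();
}
```

For a two-state, two-symbol model this produces six lines, starting with N=2; and M=2; and followed by the vectors and matrices.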

float64_t model_derivative_a	(	T_STATES	i,
		T_STATES	j,
		int32_t	dimension
	)

computes log dp(lambda)/d a_ij.

Definition at line 1420 of file HMM.h.

float64_t model_derivative_b	(	T_STATES	i,
		uint16_t	j,
		int32_t	dimension
	)

computes log dp(lambda)/d b_ij.

Definition at line 1431 of file HMM.h.

float64_t model_derivative_p	(	T_STATES	i,
		int32_t	dimension
	)

computes log dp(lambda)/d p_i: backward path down to time 0, multiplied by the probability of observing the first symbol in the path at state i

Definition at line 1406 of file HMM.h.

float64_t model_derivative_q	(	T_STATES	i,
		int32_t	dimension
	)

computes log dp(lambda)/d q_i: forward path up to time T-1

Definition at line 1414 of file HMM.h.

float64_t model_probability ( int32_t dimension = -1 )

inline proxy for model probability.

Definition at line 570 of file HMM.h.

float64_t model_probability_comp ( )

calculates probability that observations were generated by the model using forward algorithm.

Definition at line 1234 of file HMM.cpp.

void normalize ( bool keep_dead_states = false )

normalize the model to satisfy stochasticity

Definition at line 4780 of file HMM.cpp.
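A minimal sketch of what normalizing to satisfy stochasticity can look like is given below, in standalone C++ and plain probability space. It is not the CHMM code. The names are hypothetical, and the sketch assumes that p sums to one, that each emission row sums to one, and that each transition row together with its end probability q_i sums to one; it leaves all-zero rows untouched and does not model the keep_dead_states flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in for the model, not the CHMM class.
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    std::vector<double> q;  // end state distribution, size N
    Matrix a;               // transition matrix, N x N
    Matrix b;               // emission matrix, N x M
};

// Scale a non-negative vector so it sums to one; all-zero vectors stay as they are.
void normalize_distribution(std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) {
        assert(x >= 0.0);
        sum += x;
    }
    if (sum > 0.0)
        for (double& x : v) x /= sum;
}

void normalize_model(ToyHmm& h) {
    const std::size_t N = h.p.size();
    assert(h.q.size() == N && h.a.size() == N && h.b.size() == N);

    normalize_distribution(h.p);
    for (std::size_t i = 0; i < N; ++i)
        normalize_distribution(h.b[i]);

    // Assumption: leaving state i means either moving to some state j or ending,
    // so a[i][*] and q[i] are normalized together.
    for (std::size_t i = 0; i < N; ++i) {
        double sum = h.q[i];
        for (double x : h.a[i]) sum += x;
        if (sum > 0.0) {
            for (double& x : h.a[i]) x /= sum;
            h.q[i] /= sum;
        }
    }
}
```

A step like this is typically needed after operations that break the sum-to-one property, such as chopping small parameters to zero.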

void open_bracket ( FILE * file ) [protected]

expect open bracket.

Definition at line 2756 of file HMM.cpp.

void output_model ( bool verbose = false )

prints the model parameters on screen.

Parameters:

	verbose	when false, only the model probability is printed; when true, the whole model is printed as well

Definition at line 2208 of file HMM.cpp.

void output_model_defined ( bool verbose = false )

performs output_model only for the defined transitions etc

Definition at line 2292 of file HMM.cpp.

float64_t path_derivative_a	(	T_STATES	i,
		T_STATES	j,
		int32_t	dimension
	)

computes d log p(lambda,best_path)/d a_ij

Definition at line 1467 of file HMM.h.

float64_t path_derivative_b	(	T_STATES	i,
		uint16_t	j,
		int32_t	dimension
	)

computes d log p(lambda,best_path)/d b_ij

Definition at line 1474 of file HMM.h.

float64_t path_derivative_p	(	T_STATES	i,
		int32_t	dimension
	)

computes d log p(lambda,best_path)/d p_i

Definition at line 1453 of file HMM.h.

float64_t path_derivative_q	(	T_STATES	i,
		int32_t	dimension
	)

computes d log p(lambda,best_path)/d q_i

Definition at line 1460 of file HMM.h.

bool permutation_entropy	(	int32_t	window_width,
		int32_t	sequence_number
	)

compute permutation entropy

Definition at line 5407 of file HMM.cpp.

void prepare_path_derivative ( int32_t dim ) [protected]

initialization function that is called before path_derivatives are calculated

Definition at line 1511 of file HMM.h.

bool save_likelihood ( FILE * file )

save model probability in ascii format

Parameters:

	file	filehandle

Definition at line 4079 of file HMM.cpp.

bool save_likelihood_bin ( FILE * file )

save model probability in binary format

Parameters:

	file	filehandle

Definition at line 4062 of file HMM.cpp.

bool save_model ( FILE * file )

save model to file.

Parameters:

	file	filehandle to model file

Definition at line 3929 of file HMM.cpp.

bool save_model_bin ( FILE * file )

save model in binary format.

Parameters:

	file	filehandle

Definition at line 4100 of file HMM.cpp.

bool save_model_derivatives ( FILE * file )

save model derivatives to file in ascii format.

Parameters:

	file	filehandle

Definition at line 4454 of file HMM.cpp.

bool save_model_derivatives_bin ( FILE * file )

save model derivatives to file in binary format.

Parameters:

	file	filehandle

Definition at line 4333 of file HMM.cpp.

bool save_path ( FILE * file )

save viterbi path in ascii format

Parameters:

	file	filehandle

Definition at line 4038 of file HMM.cpp.

bool save_path_derivatives ( FILE * file )

save viterbi path derivatives in ascii format

Parameters:

	file	filehandle

Definition at line 4202 of file HMM.cpp.

bool save_path_derivatives_bin ( FILE * file )

save viterbi path derivatives in binary format

Parameters:

	file	filehandle

Definition at line 4250 of file HMM.cpp.

void set_a	(	T_STATES	line_,
		T_STATES	column,
		float64_t	value
	)

access function for matrix a

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...N-1
	value	value to be set

Definition at line 1027 of file HMM.h.

void set_A	(	T_STATES	line_,
		T_STATES	column,
		float64_t	value
	)

access function for matrix A

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...N-1
	value	value to be set

Definition at line 1013 of file HMM.h.

void set_B	(	T_STATES	line_,
		uint16_t	column,
		float64_t	value
	)

access function for matrix B

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...M-1
	value	value to be set

Definition at line 1041 of file HMM.h.

void set_b	(	T_STATES	line_,
		uint16_t	column,
		float64_t	value
	)

access function for matrix b

Parameters:

	line_	row in matrix 0...N-1
	column	column in matrix 0...M-1
	value	value to be set

Definition at line 1055 of file HMM.h.

bool set_epsilon ( float64_t eps )

Definition at line 623 of file HMM.h.

bool set_iterations ( int32_t num )

Definition at line 621 of file HMM.h.

void set_observation_nocache ( CStringFeatures< uint16_t > * obs )

Set new observations: only sets the observation pointer and drops caches if there were any.

Definition at line 5220 of file HMM.cpp.

void set_observations	(	CStringFeatures< uint16_t > *	obs,
		CHMM *	hmm = `NULL`
	)

Observation functions: set/get observation matrix. Set new observations: sets the observation pointer and initializes observation-dependent caches. If hmm is given, the caches of the model hmm are used.

Definition at line 5262 of file HMM.cpp.

void set_p	(	T_STATES	offset,
		float64_t	value
	)

access function for probability of first state

Parameters:

	offset	index 0...N-1
	value	value to be set

Definition at line 999 of file HMM.h.

void set_pseudo ( float64_t pseudo )

sets current pseudo value

Definition at line 754 of file HMM.h.

void set_psi	(	int32_t	time,
		T_STATES	state,
		T_STATES	value,
		int32_t	dimension
	)

access function for backtracking table psi

Parameters:

	time	time 0...T-1
	state	state 0...N-1
	value	value to be set
	dimension	dimension of observations 0...DIMENSION-1

Definition at line 1070 of file HMM.h.
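
The backtracking table psi stores, for each time and state, the predecessor state on the best path. The sketch below shows how such a table is filled and then followed backwards in the textbook Viterbi algorithm. It handles a single observation sequence (no dimension argument), works on log-probabilities, and ignores any end-state distribution; `viterbi_path` and its parameters are hypothetical names, not part of CHMM.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper: most likely state sequence for a non-empty `obs`.
// log_p[i], log_a[i][j] and log_b[i][o] hold log-probabilities.
std::vector<int> viterbi_path(const std::vector<double>& log_p, const Matrix& log_a,
                              const Matrix& log_b, const std::vector<int>& obs) {
    const std::size_t N = log_p.size(), T = obs.size();
    Matrix delta(T, std::vector<double>(N));
    // psi[t][j]: best predecessor of state j at time t (row 0 is unused).
    std::vector<std::vector<int>> psi(T, std::vector<int>(N, 0));

    for (std::size_t i = 0; i < N; ++i)
        delta[0][i] = log_p[i] + log_b[i][obs[0]];

    for (std::size_t t = 1; t < T; ++t)
        for (std::size_t j = 0; j < N; ++j) {
            std::size_t best = 0;
            for (std::size_t i = 1; i < N; ++i)
                if (delta[t - 1][i] + log_a[i][j] > delta[t - 1][best] + log_a[best][j])
                    best = i;
            psi[t][j] = static_cast<int>(best);
            delta[t][j] = delta[t - 1][best] + log_a[best][j] + log_b[j][obs[t]];
        }

    // Pick the best final state, then follow psi backwards.
    std::vector<int> path(T);
    std::size_t last = 0;
    for (std::size_t i = 1; i < N; ++i)
        if (delta[T - 1][i] > delta[T - 1][last])
            last = i;
    path[T - 1] = static_cast<int>(last);
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = psi[t][path[t]];
    return path;
}
```

The table needs one entry per time step and state, which is why the listing warns that it can grow as O(T*N).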

void set_q	(	T_STATES	offset,
		float64_t	value
	)

access function for probability of end states

Parameters:

	offset	index 0...N-1
	value	value to be set

Definition at line 986 of file HMM.h.

float64_t state_probability	(	int32_t	time,
		int32_t	state,
		int32_t	dimension
	)

calculates the probability of being in state i (parameter state) at time t (parameter time) for the given dimension

Definition at line 1365 of file HMM.h.
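
In the textbook formulation, the probability of being in state i at time t is the product of the forward and backward variables divided by the sequence likelihood. The sketch below computes it for a toy model in the linear (non-log) domain, for one sequence, and without an end-state distribution. These are simplifying assumptions, and `ToyHmm`, `forward`, `backward` and `state_posterior` are hypothetical names rather than CHMM members.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical toy model in the linear domain.
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    Matrix a;               // a[i][j]: transition probability i -> j, N x N
    Matrix b;               // b[i][o]: probability of symbol o in state i, N x M
};

// alpha[t][i] = P(o_0..o_t, state at t is i); obs must be non-empty.
Matrix forward(const ToyHmm& h, const std::vector<int>& obs) {
    const std::size_t N = h.p.size(), T = obs.size();
    Matrix alpha(T, std::vector<double>(N, 0.0));
    for (std::size_t i = 0; i < N; ++i)
        alpha[0][i] = h.p[i] * h.b[i][obs[0]];
    for (std::size_t t = 1; t < T; ++t)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t i = 0; i < N; ++i)
                sum += alpha[t - 1][i] * h.a[i][j];
            alpha[t][j] = sum * h.b[j][obs[t]];
        }
    return alpha;
}

// beta[t][i] = P(o_{t+1}..o_{T-1} | state at t is i); the last row is all ones.
Matrix backward(const ToyHmm& h, const std::vector<int>& obs) {
    const std::size_t N = h.p.size(), T = obs.size();
    Matrix beta(T, std::vector<double>(N, 1.0));
    for (std::size_t t = T - 1; t-- > 0;)
        for (std::size_t i = 0; i < N; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < N; ++j)
                sum += h.a[i][j] * h.b[j][obs[t + 1]] * beta[t + 1][j];
            beta[t][i] = sum;
        }
    return beta;
}

// gamma_t(i) = alpha_t(i) * beta_t(i) / P(O)
double state_posterior(const ToyHmm& h, const std::vector<int>& obs,
                       std::size_t t, std::size_t i) {
    const Matrix alpha = forward(h, obs), beta = backward(h, obs);
    double likelihood = 0.0;
    for (double v : alpha.back())
        likelihood += v;
    return alpha[t][i] * beta[t][i] / likelihood;
}
```

For any fixed t the posteriors over all states sum to one, which is a convenient sanity check. A production implementation would cache alpha and beta instead of recomputing them per call, and would usually work with log-probabilities to avoid underflow on long sequences.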

bool train ( CFeatures * data = NULL ) [virtual]

learn distribution

Parameters:

	data	training data (parameter can be avoided if distance or kernel-based classifiers are used and distance/kernels are initialized with train data)

Returns: whether training was successful

Implements CDistribution.

Definition at line 444 of file HMM.cpp.

float64_t transition_probability	(	int32_t	time,
		int32_t	state_i,
		int32_t	state_j,
		int32_t	dimension
	)

calculates the probability of being in state i (parameter state_i) at time t and in state j (parameter state_j) at time t+1 for the given dimension

Definition at line 1372 of file HMM.h.
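
For orientation, the quantity described above corresponds to the variable usually written as xi in the Rabiner tutorial cited in the class description. The formula below uses the tutorial's symbols (alpha and beta for the forward and backward variables, lambda for the model, O for the observation sequence), none of which appear in this listing, with indices shifted to the 0-based ranges used here. Whether CHMM evaluates it in exactly this form, for example in the log domain or with an end-state term, is not stated in this documentation.

```latex
\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)},
\qquad
\gamma_t(i) = \sum_{j=0}^{N-1} \xi_t(i,j) \quad \text{for } 0 \le t \le T-2
```

The second identity ties the transition posterior to the per-state posterior: summing xi over the successor state j gives the probability of being in state i at time t.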

Member Data Documentation

float64_t all_pat_prob [protected]

probability of best path

Definition at line 1234 of file HMM.h.

bool all_path_prob_updated [protected]

true if path probability is up to date

Definition at line 1246 of file HMM.h.

T_ALPHA_BETA alpha_cache [protected]

cache for forward variables; can be extremely large, O(T*N)

Definition at line 1307 of file HMM.h.

float64_t* arrayN1 [protected]

array of size N for temporary calculations

Definition at line 1271 of file HMM.h.

float64_t* arrayN2 [protected]

array of size N for temporary calculations

Definition at line 1273 of file HMM.h.

T_ALPHA_BETA beta_cache [protected]

cache for backward variables; can be extremely large, O(T*N)

Definition at line 1309 of file HMM.h.

int32_t conv_it [protected]

Definition at line 1231 of file HMM.h.

float64_t* end_state_distribution_q [protected]

distribution of end-states

Definition at line 1220 of file HMM.h.

float64_t epsilon [protected]

convergence criterion epsilon

Definition at line 1230 of file HMM.h.

const int32_t GOTa = (1<<4) [static, protected]

GOTa

Definition at line 1333 of file HMM.h.

const int32_t GOTb = (1<<5) [static, protected]

GOTb

Definition at line 1335 of file HMM.h.

const int32_t GOTconst_a = (1<<5) [static, protected]

GOTconst_a

Definition at line 1350 of file HMM.h.

const int32_t GOTconst_b = (1<<6) [static, protected]

GOTconst_b

Definition at line 1352 of file HMM.h.

const int32_t GOTconst_p = (1<<7) [static, protected]

GOTconst_p

Definition at line 1354 of file HMM.h.

const int32_t GOTconst_q = (1<<8) [static, protected]

GOTconst_q

Definition at line 1356 of file HMM.h.

const int32_t GOTlearn_a = (1<<1) [static, protected]

GOTlearn_a

Definition at line 1342 of file HMM.h.

const int32_t GOTlearn_b = (1<<2) [static, protected]

GOTlearn_b

Definition at line 1344 of file HMM.h.

const int32_t GOTlearn_p = (1<<3) [static, protected]

GOTlearn_p

Definition at line 1346 of file HMM.h.

const int32_t GOTlearn_q = (1<<4) [static, protected]

GOTlearn_q

Definition at line 1348 of file HMM.h.

const int32_t GOTM = (1<<2) [static, protected]

GOTM

Definition at line 1329 of file HMM.h.

const int32_t GOTN = (1<<1) [static, protected]

GOTN

Definition at line 1327 of file HMM.h.

const int32_t GOTO = (1<<3) [static, protected]

GOTO

Definition at line 1331 of file HMM.h.

const int32_t GOTp = (1<<6) [static, protected]

GOTp

Definition at line 1337 of file HMM.h.

const int32_t GOTq = (1<<7) [static, protected]

GOTq

Definition at line 1339 of file HMM.h.
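
The GOT* constants are single-bit masks, and the listing gives no description beyond their names. Constants of this shape are typically OR-ed into a status word to record which items have been seen, and tested with AND. The sketch below shows that pattern with three values copied from the listing. The helper functions and the idea that the flags track received items are assumptions, not documented behaviour. Note also that some values in the listing coincide (GOTa and GOTlearn_q are both 1<<4), which suggests the groups are meant for separate status words.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the listing; everything else here is hypothetical.
const int32_t GOTN = (1 << 1);
const int32_t GOTM = (1 << 2);
const int32_t GOTO = (1 << 3);

// Records that an item has been seen by setting its bit.
inline int32_t mark_seen(int32_t received, int32_t flag) {
    return received | flag;
}

// True only if every bit in `required` is set in `received`.
inline bool has_all(int32_t received, int32_t required) {
    return (received & required) == required;
}
```

A caller would start with a status word of 0, call `mark_seen` as each item arrives, and use `has_all` with the OR of the required masks to decide whether the input was complete.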

float64_t* initial_state_distribution_p [protected]

initial distribution of states

Definition at line 1217 of file HMM.h.

int32_t iteration_count [protected]

Definition at line 1227 of file HMM.h.

int32_t iterations [protected]

convergence criterion iterations

Definition at line 1226 of file HMM.h.

int32_t line [protected]

Definition at line 1199 of file HMM.h.

bool loglikelihood [protected]

Definition at line 1255 of file HMM.h.

int32_t M [protected]

number of observation symbols, e.g. the alphabet ACGT is encoded as 0123

Definition at line 1190 of file HMM.h.

float64_t mod_prob [protected]

probability of model

Definition at line 1240 of file HMM.h.

bool mod_prob_updated [protected]

true if model probability is up to date

Definition at line 1243 of file HMM.h.

Model* model [protected]

Definition at line 1205 of file HMM.h.

int32_t N [protected]

number of states

Definition at line 1193 of file HMM.h.

float64_t* observation_matrix_B [protected]

matrix of absolute counts of observations within each state

Definition at line 1211 of file HMM.h.

float64_t* observation_matrix_b [protected]

distribution of observations within each state

Definition at line 1223 of file HMM.h.

CStringFeatures<uint16_t>* p_observations [protected]

observation matrix

Definition at line 1202 of file HMM.h.

float64_t pat_prob [protected]

probability of best path

Definition at line 1237 of file HMM.h.

T_STATES* path [protected]

best path (=state sequence) through model

Definition at line 1315 of file HMM.h.

int32_t path_deriv_dimension [protected]

dimension for which path_deriv was calculated

Definition at line 1249 of file HMM.h.

bool path_deriv_updated [protected]

true if path derivative is up to date

Definition at line 1252 of file HMM.h.

int32_t path_prob_dimension [protected]

dimension for which path_prob was calculated

Definition at line 1321 of file HMM.h.

bool path_prob_updated [protected]

true if path probability is up to date

Definition at line 1318 of file HMM.h.

float64_t PSEUDO [protected]

pseudocount value used to guard against overfitting

Definition at line 1196 of file HMM.h.
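
Pseudocounts are commonly applied when absolute counts, such as those in transition_matrix_A or observation_matrix_B, are turned into probabilities: a small constant is added to every count so that unseen events do not receive probability zero. The sketch below shows this additive smoothing for one row of counts. `normalize_with_pseudo` is a hypothetical helper, and the exact formula CHMM applies may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: converts one row of absolute counts into probabilities,
// adding `pseudo` to every count first. Assumes the smoothed total is non-zero,
// i.e. pseudo > 0 or at least one count is positive.
std::vector<double> normalize_with_pseudo(const std::vector<double>& counts,
                                          double pseudo) {
    double total = 0.0;
    for (double c : counts)
        total += c + pseudo;
    std::vector<double> probs;
    probs.reserve(counts.size());
    for (double c : counts)
        probs.push_back((c + pseudo) / total);
    return probs;
}
```

With counts 3, 0, 1 and a pseudocount of 1, the smoothed total is 7 and the symbol that was never observed still receives probability 1/7.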

bool reused_caches [protected]

Definition at line 1261 of file HMM.h.

T_STATES* states_per_observation_psi [protected]

backtracking table for Viterbi; can be extremely large, O(T*N)

Definition at line 1312 of file HMM.h.

bool status [protected]

Definition at line 1258 of file HMM.h.

float64_t* transition_matrix_a [protected]

transition matrix

Definition at line 1214 of file HMM.h.

float64_t* transition_matrix_A [protected]

matrix of absolute counts of transitions

Definition at line 1208 of file HMM.h.

The documentation for this class was generated from the following files:
