CHMM Class Reference


Detailed Description

Hidden Markov Model.

Structure and function collection. This class implements a Hidden Markov Model. For a tutorial on HMMs, see L. R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", 1989.

Several functions are supplied for tasks such as training, reading/writing models, reading observations, and calculating derivatives.

Definition at line 365 of file HMM.h.

Public Member Functions

 CHMM (void)
bool alloc_state_dependend_arrays ()
 allocates memory that depends on N
void free_state_dependend_arrays ()
 frees memory that depends on N
bool linear_train (bool right_align=false)
 estimates linear model from observations.
bool permutation_entropy (int32_t window_width, int32_t sequence_number)
 compute permutation entropy
virtual const char * get_name () const
Constructor/Destructor and helper function

 CHMM (int32_t N, int32_t M, Model *model, float64_t PSEUDO)
 CHMM (CStringFeatures< uint16_t > *obs, int32_t N, int32_t M, float64_t PSEUDO)
 CHMM (int32_t N, float64_t *p, float64_t *q, float64_t *a)
 CHMM (int32_t N, float64_t *p, float64_t *q, int32_t num_trans, float64_t *a_trans)
 CHMM (FILE *model_file, float64_t PSEUDO)
 CHMM (CHMM *h)
 Constructor - Clone model h.
virtual ~CHMM ()
 Destructor - Cleanup.
virtual bool train (CFeatures *data=NULL)
virtual int32_t get_num_model_parameters ()
virtual float64_t get_log_model_parameter (int32_t num_param)
virtual float64_t get_log_derivative (int32_t num_param, int32_t num_example)
virtual float64_t get_log_likelihood_example (int32_t num_example)
bool initialize (Model *model, float64_t PSEUDO, FILE *model_file=NULL)
probability functions.

forward/backward/viterbi algorithm

float64_t forward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t forward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp (int32_t time, int32_t state, int32_t dimension)
float64_t backward_comp_old (int32_t time, int32_t state, int32_t dimension)
float64_t best_path (int32_t dimension)
uint16_t get_best_path_state (int32_t dim, int32_t t)
float64_t model_probability_comp ()
float64_t model_probability (int32_t dimension=-1)
 inline proxy for model probability.
float64_t linear_model_probability (int32_t dimension)
convergence criteria

bool set_iterations (int32_t num)
int32_t get_iterations ()
bool set_epsilon (float64_t eps)
float64_t get_epsilon ()
bool baum_welch_viterbi_train (BaumWelchViterbiType type)
model training

void estimate_model_baum_welch (CHMM *train)
void estimate_model_baum_welch_trans (CHMM *train)
void estimate_model_baum_welch_old (CHMM *train)
void estimate_model_baum_welch_defined (CHMM *train)
void estimate_model_viterbi (CHMM *train)
void estimate_model_viterbi_defined (CHMM *train)
output functions.

void output_model (bool verbose=false)
void output_model_defined (bool verbose=false)
 performs output_model only for the defined transitions etc
model helper functions.

void normalize (bool keep_dead_states=false)
 normalize the model to satisfy stochasticity
void add_states (int32_t num_states, float64_t default_val=0)
bool append_model (CHMM *append_model, float64_t *cur_out, float64_t *app_out)
bool append_model (CHMM *append_model)
void chop (float64_t value)
 set any model parameter with probability smaller than value to ZERO
void convert_to_log ()
 convert model to log probabilities
void init_model_random ()
 init model with random values
void init_model_defined ()
void clear_model ()
 initializes model with log(PSEUDO)
void clear_model_defined ()
 initializes only parameters in learn_x with log(PSEUDO)
void copy_model (CHMM *l)
 copies the model parameters from l
void invalidate_model ()
bool get_status () const
float64_t get_pseudo () const
 returns current pseudo value
void set_pseudo (float64_t pseudo)
 sets current pseudo value

void set_observations (CStringFeatures< uint16_t > *obs, CHMM *hmm=NULL)
void set_observation_nocache (CStringFeatures< uint16_t > *obs)
CStringFeatures< uint16_t > * get_observations ()
 return observation pointer
load/save functions.

for observations/model/traindefinitions

bool load_definitions (FILE *file, bool verbose, bool initialize=true)
bool load_model (FILE *file)
bool save_model (FILE *file)
bool save_model_derivatives (FILE *file)
bool save_model_derivatives_bin (FILE *file)
bool save_model_bin (FILE *file)
bool check_model_derivatives ()
 numerically check whether derivatives were calculated correctly
bool check_model_derivatives_combined ()
T_STATES * get_path (int32_t dim, float64_t &prob)
bool save_path (FILE *file)
bool save_path_derivatives (FILE *file)
bool save_path_derivatives_bin (FILE *file)
bool save_likelihood_bin (FILE *file)
bool save_likelihood (FILE *file)
access functions for model parameters

for all the arrays a,b,p,q,A,B,psi and scalar model parameters like N,M

T_STATES get_N () const
 access function for number of states N
int32_t get_M () const
 access function for number of observations M
void set_q (T_STATES offset, float64_t value)
void set_p (T_STATES offset, float64_t value)
void set_A (T_STATES line_, T_STATES column, float64_t value)
void set_a (T_STATES line_, T_STATES column, float64_t value)
void set_B (T_STATES line_, uint16_t column, float64_t value)
void set_b (T_STATES line_, uint16_t column, float64_t value)
void set_psi (int32_t time, T_STATES state, T_STATES value, int32_t dimension)
float64_t get_q (T_STATES offset) const
float64_t get_p (T_STATES offset) const
float64_t get_A (T_STATES line_, T_STATES column) const
float64_t get_a (T_STATES line_, T_STATES column) const
float64_t get_B (T_STATES line_, uint16_t column) const
float64_t get_b (T_STATES line_, uint16_t column) const
T_STATES get_psi (int32_t time, T_STATES state, int32_t dimension) const
functions for observations

management and access functions for observation matrix

float64_t state_probability (int32_t time, int32_t state, int32_t dimension)
 calculates probability of being in state i at time t for dimension
float64_t transition_probability (int32_t time, int32_t state_i, int32_t state_j, int32_t dimension)
 calculates probability of being in state i at time t and state j at time t+1 for dimension
derivatives of model probabilities.

computes log dp(lambda)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific
float64_t linear_model_derivative (T_STATES i, uint16_t j, int32_t dimension)
float64_t model_derivative_p (T_STATES i, int32_t dimension)
float64_t model_derivative_q (T_STATES i, int32_t dimension)
float64_t model_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes log dp(lambda)/d a_ij.
float64_t model_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes log dp(lambda)/d b_ij.
derivatives of path probabilities.

computes d log p(lambda,best_path)/d lambda_i

Parameters:
dimension dimension for which derivatives are calculated
i,j parameter specific
float64_t path_derivative_p (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d p_i
float64_t path_derivative_q (T_STATES i, int32_t dimension)
 computes d log p(lambda,best_path)/d q_i
float64_t path_derivative_a (T_STATES i, T_STATES j, int32_t dimension)
 computes d log p(lambda,best_path)/d a_ij
float64_t path_derivative_b (T_STATES i, uint16_t j, int32_t dimension)
 computes d log p(lambda,best_path)/d b_ij

Protected Member Functions

void prepare_path_derivative (int32_t dim)
 initialization function that is called before path_derivatives are calculated
float64_t forward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for forward pass
float64_t backward (int32_t time, int32_t state, int32_t dimension)
 inline proxies for backward pass
input helper functions.

for reading model/definition/observation files

bool get_numbuffer (FILE *file, char *buffer, int32_t length)
 put a sequence of numbers into the buffer
void open_bracket (FILE *file)
 expect open bracket.
void close_bracket (FILE *file)
 expect closing bracket
bool comma_or_space (FILE *file)
 expect comma or space.
void error (int32_t p_line, const char *str)
 parse error messages

Protected Attributes

float64_t * arrayN1
float64_t * arrayN2
T_ALPHA_BETA alpha_cache
 cache for forward variables; can be terribly huge, O(T*N)
T_ALPHA_BETA beta_cache
 cache for backward variables; can be terribly huge, O(T*N)
T_STATES * states_per_observation_psi
 backtracking table for viterbi; can be terribly huge, O(T*N)
T_STATES * path
 best path (=state sequence) through model
bool path_prob_updated
 true if path probability is up to date
int32_t path_prob_dimension
 dimension for which path_prob was calculated
model specific variables.

these are p,q,a,b,N,M etc

int32_t M
 number of observation symbols, e.g. ACGT -> 0123
int32_t N
 number of states
float64_t PSEUDO
 define pseudocounts against overfitting
int32_t line
CStringFeatures< uint16_t > * p_observations
 observation matrix
Model * model
float64_t * transition_matrix_A
 matrix of absolute counts of transitions
float64_t * observation_matrix_B
 matrix of absolute counts of observations within each state
float64_t * transition_matrix_a
 transition matrix
float64_t * initial_state_distribution_p
 initial distribution of states
float64_t * end_state_distribution_q
 distribution of end-states
float64_t * observation_matrix_b
 distribution of observations within each state
int32_t iterations
 convergence criterion iterations
int32_t iteration_count
float64_t epsilon
 convergence criterion epsilon
int32_t conv_it
float64_t all_pat_prob
 probability of best path
float64_t pat_prob
 probability of best path
float64_t mod_prob
 probability of model
bool mod_prob_updated
 true if model probability is up to date
bool all_path_prob_updated
 true if path probability is up to date
int32_t path_deriv_dimension
 dimension for which path_deriv was calculated
bool path_deriv_updated
 true if path derivative is up to date
bool loglikelihood
bool status
bool reused_caches

Static Protected Attributes

static const int32_t GOTN = (1<<1)
static const int32_t GOTM = (1<<2)
static const int32_t GOTO = (1<<3)
static const int32_t GOTa = (1<<4)
static const int32_t GOTb = (1<<5)
static const int32_t GOTp = (1<<6)
static const int32_t GOTq = (1<<7)
static const int32_t GOTlearn_a = (1<<1)
static const int32_t GOTlearn_b = (1<<2)
static const int32_t GOTlearn_p = (1<<3)
static const int32_t GOTlearn_q = (1<<4)
static const int32_t GOTconst_a = (1<<5)
static const int32_t GOTconst_b = (1<<6)
static const int32_t GOTconst_p = (1<<7)
static const int32_t GOTconst_q = (1<<8)
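
The GOT* constants are distinct powers of two within each group, so several of them can be combined in one integer and tested individually. The following is a minimal sketch of that bitmask pattern. The helper `has_all` and the idea that the flags record which fields have already been seen are assumptions made for illustration; they are not taken from HMM.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the list above.
static const std::int32_t GOTN = (1 << 1);
static const std::int32_t GOTM = (1 << 2);
static const std::int32_t GOTa = (1 << 4);

// Hypothetical helper: true if every bit in `required` is set in `received`.
bool has_all(std::int32_t received, std::int32_t required) {
    assert(required != 0);
    return (received & required) == required;
}
```

Note that the two groups reuse the same bit values (for example GOTN and GOTlearn_a are both 1<<1), so flags from different groups cannot share one mask.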

Constructor & Destructor Documentation

CHMM ( void   ) 

Train definitions. Encapsulates model parameters that are constant or shall be learned. Consists of structures and access functions for learning only defined transitions and constants. Default constructor.

Definition at line 143 of file HMM.cpp.

CHMM ( int32_t  N,
int32_t  M,
Model * model,
float64_t  PSEUDO 
)

Constructor

Parameters:
N number of states
M number of emissions
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value

Definition at line 163 of file HMM.cpp.

CHMM ( CStringFeatures< uint16_t > *  obs,
int32_t  N,
int32_t  M,
float64_t  PSEUDO 
)

Definition at line 175 of file HMM.cpp.

CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
float64_t * a 
)

Definition at line 190 of file HMM.cpp.

CHMM ( int32_t  N,
float64_t * p,
float64_t * q,
int32_t  num_trans,
float64_t * a_trans 
)

Definition at line 242 of file HMM.cpp.

CHMM ( FILE *  model_file,
float64_t  PSEUDO 
)

Constructor - Initialization from model file.

Parameters:
model_file Filehandle to a hmm model file (*.mod)
PSEUDO Pseudo Value

Definition at line 354 of file HMM.cpp.

CHMM ( CHMM * h  ) 

Constructor - Clone model h.

Definition at line 151 of file HMM.cpp.

~CHMM (  )  [virtual]

Destructor - Cleanup.

Definition at line 362 of file HMM.cpp.


Member Function Documentation

void add_states ( int32_t  num_states,
float64_t  default_val = 0 
)

increases the number of states by num_states. The new a/b/p/q values are given the value default_val, where 0 <= default_val <= 1.

Definition at line 5015 of file HMM.cpp.

bool alloc_state_dependend_arrays (  ) 

allocates memory that depends on N

Definition at line 458 of file HMM.cpp.

bool append_model ( CHMM * append_model,
float64_t * cur_out,
float64_t * app_out 
)

appends the append_model to the current hmm, i.e. two extra states are created. One is the end state of the current hmm with outputs cur_out (of size M) and the other state is the start state of the append_model. Transition probability from state 1 to states 1 is 1.

Definition at line 4907 of file HMM.cpp.

bool append_model ( CHMM * append_model  ) 

appends the append_model to the current hmm; here no extra states are created. Former q_i are multiplied by q_ji to give the a_ij from the current hmm to the append_model.

Definition at line 4815 of file HMM.cpp.

float64_t backward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for backward pass

Definition at line 1556 of file HMM.h.

float64_t backward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

backward algorithm. Calculates Pr[O_t+1, O_t+2, ..., O_T-1 | q_time=S_i, lambda] for 0 <= time <= T-1, and Pr[O|lambda] for time >= T.

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 875 of file HMM.cpp.
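
The recursion behind the backward variables can be sketched in a few lines. This is a minimal illustration in plain (non-log) probabilities, not the code in HMM.cpp. The function name `backward_table`, the `Matrix` alias and the choice to start the recursion from the end-state distribution q are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// beta[t][i] = Pr[O_{t+1}, ..., O_{T-1} | q_t = S_i, lambda], plain probabilities.
// Assumption: the recursion starts from the end-state distribution q.
Matrix backward_table(const std::vector<double>& q, const Matrix& a,
                      const Matrix& b, const std::vector<int>& obs) {
    const std::size_t N = q.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);
    Matrix beta(T, std::vector<double>(N, 0.0));
    beta[T - 1] = q;
    for (std::size_t t = T - 1; t-- > 0;) {  // t runs from T-2 down to 0
        for (std::size_t i = 0; i < N; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < N; ++j)
                sum += a[i][j] * b[j][obs[t + 1]] * beta[t + 1][j];
            beta[t][i] = sum;
        }
    }
    return beta;
}
```

Storing the whole table costs O(T*N) memory, which is the reason the class documents its caches as potentially very large.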

float64_t backward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 974 of file HMM.cpp.

bool baum_welch_viterbi_train ( BaumWelchViterbiType  type  ) 

interface for e.g. GUIHMM to run BaumWelch or Viterbi training

Parameters:
type type of BaumWelch/Viterbi training

Definition at line 5532 of file HMM.cpp.

float64_t best_path ( int32_t  dimension  ) 

calculates probability of best state sequence s_0,...,s_T-1 AND the path itself using the Viterbi algorithm. The path can be found in the array PATH(dimension)[0..T-1] afterwards.

Parameters:
dimension dimension of observation for which the most probable path is calculated (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 1106 of file HMM.cpp.
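
As an illustration of what a Viterbi pass computes, the sketch below works in log space and returns both the best path and its log probability. It is a textbook version written for this page, not the library's implementation. `ViterbiResult`, `viterbi` and the `Matrix` alias are hypothetical names, and the termination step that weights the last column with q is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

struct ViterbiResult {
    std::vector<std::size_t> path;  // most probable state sequence s_0..s_{T-1}
    double log_prob;                // log probability of that sequence
};

ViterbiResult viterbi(const std::vector<double>& p, const std::vector<double>& q,
                      const Matrix& a, const Matrix& b,
                      const std::vector<int>& obs) {
    const std::size_t N = p.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);
    Matrix delta(T, std::vector<double>(N));
    std::vector<std::vector<std::size_t>> psi(T, std::vector<std::size_t>(N, 0));

    for (std::size_t i = 0; i < N; ++i)
        delta[0][i] = std::log(p[i]) + std::log(b[i][obs[0]]);

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t j = 0; j < N; ++j) {
            std::size_t best = 0;
            double best_score = delta[t - 1][0] + std::log(a[0][j]);
            for (std::size_t i = 1; i < N; ++i) {
                const double score = delta[t - 1][i] + std::log(a[i][j]);
                if (score > best_score) { best_score = score; best = i; }
            }
            delta[t][j] = best_score + std::log(b[j][obs[t]]);
            psi[t][j] = best;  // remember the best predecessor of state j
        }
    }

    // Termination: weight the last column with the end-state distribution q.
    std::size_t last = 0;
    double best_score = delta[T - 1][0] + std::log(q[0]);
    for (std::size_t i = 1; i < N; ++i) {
        const double score = delta[T - 1][i] + std::log(q[i]);
        if (score > best_score) { best_score = score; last = i; }
    }

    // Backtrack through psi to recover the path.
    ViterbiResult result{std::vector<std::size_t>(T, 0), best_score};
    result.path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        result.path[t - 1] = psi[t][result.path[t]];
    return result;
}
```

The `psi` table here plays the same role as the backtracking table accessed through get_psi and set_psi.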

bool check_model_derivatives (  ) 

numerically check whether derivatives were calculated correctly

Definition at line 4572 of file HMM.cpp.
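
A numerical derivative check usually compares an analytic derivative with a finite-difference estimate. The sketch below shows the idea with a central difference; the function names, the step size and the tolerance are assumptions, and the actual check in HMM.cpp may be organised differently.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Central finite difference of f at x with step h.
double numeric_derivative(const std::function<double(double)>& f,
                          double x, double h) {
    assert(h > 0.0);
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Hypothetical check: does an analytic derivative agree with the numeric one?
bool derivative_matches(double analytic, const std::function<double(double)>& f,
                        double x, double h, double tolerance) {
    return std::fabs(analytic - numeric_derivative(f, x, h)) <= tolerance;
}
```

For example, with f(x) = x*x the analytic derivative at x = 3 is 6, and `derivative_matches(6.0, f, 3.0, 1e-5, 1e-6)` is expected to hold.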

bool check_model_derivatives_combined (  ) 

Definition at line 4502 of file HMM.cpp.

void chop ( float64_t  value  ) 

set any model parameter with probability smaller than value to ZERO

Definition at line 5075 of file HMM.cpp.
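
The effect of chopping can be shown on a plain vector of probabilities. This is a sketch only: the name `chop_small` and the returned count are conveniences invented for the example, and the library may operate on log values instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sets every entry smaller than `value` to zero (plain probabilities).
// Returns how many entries were zeroed; this return value is an assumption
// added for illustration.
std::size_t chop_small(std::vector<double>& params, double value) {
    assert(value >= 0.0);
    std::size_t zeroed = 0;
    for (double& x : params) {
        if (x < value) {
            x = 0.0;
            ++zeroed;
        }
    }
    return zeroed;
}
```

After chopping, a row no longer sums exactly to 1, so a normalization step is typically needed to restore stochasticity.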

void clear_model (  ) 

initializes model with log(PSEUDO)

Definition at line 2614 of file HMM.cpp.

void clear_model_defined (  ) 

initializes only parameters in learn_x with log(PSEUDO)

Definition at line 2630 of file HMM.cpp.

void close_bracket ( FILE *  file  )  [protected]

expect closing bracket

Definition at line 2777 of file HMM.cpp.

bool comma_or_space ( FILE *  file  )  [protected]

expect comma or space.

Definition at line 2790 of file HMM.cpp.

void convert_to_log (  ) 

convert model to log probabilities

Definition at line 2347 of file HMM.cpp.
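
Working with log probabilities avoids underflow on long sequences, but sums of probabilities then need a dedicated helper. The sketch below shows an element-wise conversion and a log-space addition; `to_log` and `log_add` are hypothetical names and are not claimed to match the library's internals.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Replaces each probability by its natural logarithm; log(0) becomes -infinity.
void to_log(std::vector<double>& probs) {
    for (double& x : probs) {
        assert(x >= 0.0);
        x = std::log(x);
    }
}

// Returns log(exp(x) + exp(y)) without leaving log space.
double log_add(double x, double y) {
    if (x < y) std::swap(x, y);              // x is now the larger value
    if (std::isinf(y) && y < 0.0) return x;  // adding probability zero
    return x + std::log1p(std::exp(y - x));
}
```

Products of probabilities become plain additions in this representation, and only sums require `log_add`.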

void copy_model ( CHMM * l  ) 

copies the model parameters from l

Definition at line 2653 of file HMM.cpp.

void error ( int32_t  p_line,
const char *  str 
) [protected]

parse error messages

Definition at line 1501 of file HMM.h.

void estimate_model_baum_welch ( CHMM * train  ) 

uses the Baum-Welch algorithm to train a fully connected HMM.

Parameters:
train model from which the new model is estimated

Definition at line 1482 of file HMM.cpp.

void estimate_model_baum_welch_defined ( CHMM * train  ) 

uses the Baum-Welch algorithm to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 1723 of file HMM.cpp.

void estimate_model_baum_welch_old ( CHMM * train  ) 

Definition at line 1568 of file HMM.cpp.

void estimate_model_baum_welch_trans ( CHMM * train  ) 

Definition at line 1653 of file HMM.cpp.

void estimate_model_viterbi ( CHMM * train  ) 

uses Viterbi training to train a fully connected HMM

Parameters:
train model from which the new model is estimated

Definition at line 1899 of file HMM.cpp.
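
Viterbi training re-estimates parameters from counts collected along the best path. The sketch below shows this for the transition matrix only: absolute counts (the role of A) are turned into probabilities (the role of a). Starting every count at the pseudocount is an assumption about how PSEUDO is used, and `estimate_transitions` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Counts transitions along one state path and turns them into probabilities.
// Assumption: every count starts at `pseudo`, so unseen transitions keep mass.
Matrix estimate_transitions(const std::vector<std::size_t>& path,
                            std::size_t N, double pseudo) {
    assert(N > 0 && pseudo >= 0.0);
    Matrix A(N, std::vector<double>(N, pseudo));  // absolute counts
    for (std::size_t t = 0; t + 1 < path.size(); ++t) {
        assert(path[t] < N && path[t + 1] < N);
        A[path[t]][path[t + 1]] += 1.0;
    }
    Matrix a(N, std::vector<double>(N, 0.0));     // normalized rows
    for (std::size_t i = 0; i < N; ++i) {
        double sum = 0.0;
        for (std::size_t j = 0; j < N; ++j) sum += A[i][j];
        if (sum > 0.0)                            // leave empty rows at zero
            for (std::size_t j = 0; j < N; ++j) a[i][j] = A[i][j] / sum;
    }
    return a;
}
```

With path {0, 1, 1}, N = 2 and pseudo = 0.5, both rows of the counts are {0.5, 1.5}, which normalizes to {0.25, 0.75}.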

void estimate_model_viterbi_defined ( CHMM * train  ) 

uses Viterbi training to train the defined transitions etc.

Parameters:
train model from which the new model is estimated

Definition at line 2026 of file HMM.cpp.

float64_t forward ( int32_t  time,
int32_t  state,
int32_t  dimension 
) [protected]

inline proxies for forward pass

Definition at line 1539 of file HMM.h.

float64_t forward_comp ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

forward algorithm. Calculates Pr[O_0, O_1, ..., O_t, q_time=S_i | lambda] for 0 <= time <= T-1, and Pr[O|lambda] for time > T.

Parameters:
time t
state i
dimension dimension of observation (observations are a matrix, where a row stands for one dimension, i.e. O_0,O_1,...,O_{T-1})

Definition at line 639 of file HMM.cpp.
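
The forward recursion can be sketched as follows, in plain probabilities rather than the log values a production implementation would likely use. `forward_table`, `sequence_probability` and the `Matrix` alias are hypothetical names, and terminating with the end-state distribution q is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// alpha[t][i] = Pr[O_0, ..., O_t, q_t = S_i | lambda], plain probabilities.
Matrix forward_table(const std::vector<double>& p, const Matrix& a,
                     const Matrix& b, const std::vector<int>& obs) {
    const std::size_t N = p.size();
    const std::size_t T = obs.size();
    assert(N > 0 && T > 0);
    Matrix alpha(T, std::vector<double>(N, 0.0));
    for (std::size_t i = 0; i < N; ++i)
        alpha[0][i] = p[i] * b[i][obs[0]];
    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t i = 0; i < N; ++i)
                sum += alpha[t - 1][i] * a[i][j];
            alpha[t][j] = sum * b[j][obs[t]];
        }
    }
    return alpha;
}

// Pr[O | lambda]: weight the last column with the end-state distribution q.
double sequence_probability(const Matrix& alpha, const std::vector<double>& q) {
    assert(!alpha.empty() && alpha.back().size() == q.size());
    double prob = 0.0;
    for (std::size_t i = 0; i < q.size(); ++i)
        prob += alpha.back()[i] * q[i];
    return prob;
}
```

For a two-state model that starts in state 0, emits symbol 0 there and symbol 1 in state 1, and moves on with probability 0.5, the observation sequence {0, 1} has probability 0.5.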

float64_t forward_comp_old ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

Definition at line 743 of file HMM.cpp.

void free_state_dependend_arrays (  ) 

frees memory that depends on N

Definition at line 515 of file HMM.cpp.

float64_t get_A ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1111 of file HMM.h.

float64_t get_a ( T_STATES  line_,
T_STATES  column 
) const

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
Returns:
value at position (line_, column)

Definition at line 1125 of file HMM.h.

float64_t get_B ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1139 of file HMM.h.

float64_t get_b ( T_STATES  line_,
uint16_t  column 
) const

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
Returns:
value at position (line_, column)

Definition at line 1153 of file HMM.h.

uint16_t get_best_path_state ( int32_t  dim,
int32_t  t 
)

Definition at line 559 of file HMM.h.

float64_t get_epsilon (  ) 

Definition at line 624 of file HMM.h.

int32_t get_iterations (  ) 

Definition at line 622 of file HMM.h.

float64_t get_log_derivative ( int32_t  num_param,
int32_t  num_example 
) [virtual]

get partial derivative of likelihood function (logarithmic)

abstract base method

Parameters:
num_param parameter with respect to which the derivative is taken
num_example which example
Returns:
derivative of likelihood (logarithmic)

Implements CDistribution.

Definition at line 5465 of file HMM.cpp.

virtual float64_t get_log_likelihood_example ( int32_t  num_example  )  [virtual]

compute log likelihood for example

abstract base method

Parameters:
num_example which example
Returns:
log likelihood for example

Implements CDistribution.

Definition at line 509 of file HMM.h.

float64_t get_log_model_parameter ( int32_t  num_param  )  [virtual]

get model parameter (logarithmic)

abstract base method

Returns:
model parameter (logarithmic)

Implements CDistribution.

Definition at line 5490 of file HMM.cpp.

int32_t get_M (  )  const

access function for number of observations M

Definition at line 980 of file HMM.h.

T_STATES get_N (  )  const

access function for number of states N

Definition at line 977 of file HMM.h.

virtual const char* get_name ( void   )  const [virtual]
Returns:
object name

Implements CSGObject.

Definition at line 1182 of file HMM.h.

virtual int32_t get_num_model_parameters (  )  [virtual]

get number of parameters in model

abstract base method

Returns:
number of parameters in model

Implements CDistribution.

Definition at line 506 of file HMM.h.

bool get_numbuffer ( FILE *  file,
char *  buffer,
int32_t  length 
) [protected]

put a sequence of numbers into the buffer

Definition at line 2817 of file HMM.cpp.

CStringFeatures<uint16_t>* get_observations (  ) 

return observation pointer

Definition at line 795 of file HMM.h.

float64_t get_p ( T_STATES  offset  )  const

access function for probability of initial states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1097 of file HMM.h.

T_STATES * get_path ( int32_t  dim,
float64_t & prob 
)

get viterbi path and path probability

Parameters:
dim dimension for which to obtain best path
prob likelihood of path
Returns:
viterbi path

Definition at line 4025 of file HMM.cpp.

float64_t get_pseudo (  )  const

returns current pseudo value

Definition at line 748 of file HMM.h.

T_STATES get_psi ( int32_t  time,
T_STATES  state,
int32_t  dimension 
) const

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
dimension dimension of observations 0...DIMENSION-1
Returns:
state at specified time and position

Definition at line 1169 of file HMM.h.

float64_t get_q ( T_STATES  offset  )  const

access function for probability of end states

Parameters:
offset index 0...N-1
Returns:
value at offset

Definition at line 1084 of file HMM.h.

bool get_status (  )  const

get status

Returns:
true if everything is ok, else false

Definition at line 742 of file HMM.h.

void init_model_defined (  ) 

init model according to const_x, learn_x. First the model is initialized with 0 for all parameters, then parameters in learn_x are initialized with random values, and finally const_x parameters are set and the model is normalized.

Definition at line 2460 of file HMM.cpp.

void init_model_random (  ) 

init model with random values

Definition at line 2394 of file HMM.cpp.

bool initialize ( Model * model,
float64_t  PSEUDO,
FILE *  model_file = NULL 
)

initialization function - gets called by constructors.

Parameters:
model model which holds definitions of states to be learned + consts
PSEUDO Pseudo Value
model_file Filehandle to a hmm model file (*.mod)

Definition at line 550 of file HMM.cpp.

void invalidate_model (  ) 

invalidates all caches. This function has to be called when direct changes to the model have been made. This is necessary so that the forward/backward/viterbi algorithms do not work with old tables.

Definition at line 2669 of file HMM.cpp.
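
The need for invalidation follows from the caching pattern suggested by flags such as mod_prob_updated: a result is recomputed only when it is marked stale. The miniature class below is a hypothetical illustration of that pattern and shares no code with CHMM.

```cpp
#include <cassert>

// Hypothetical miniature of the caching pattern; not the CHMM implementation.
class CachedModel {
public:
    explicit CachedModel(double parameter) : parameter_(parameter) {
        assert(parameter >= 0.0 && parameter <= 1.0);
    }

    // Direct change to the model: the cached result is now stale.
    void set_parameter(double value) {
        assert(value >= 0.0 && value <= 1.0);
        parameter_ = value;
        invalidate();
    }

    void invalidate() { updated_ = false; }

    // Recomputes only when the cache is stale.
    double probability() {
        if (!updated_) {
            cached_ = expensive_computation();
            updated_ = true;
            ++recomputations_;
        }
        return cached_;
    }

    int recomputations() const { return recomputations_; }

private:
    // Stand-in for a costly forward pass.
    double expensive_computation() const { return parameter_ * parameter_; }

    double parameter_;
    double cached_ = 0.0;
    bool updated_ = false;
    int recomputations_ = 0;
};
```

If a parameter were changed without calling `invalidate`, `probability` would keep returning the value computed from the old parameter.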

float64_t linear_model_derivative ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij for linear model

Definition at line 1389 of file HMM.h.

float64_t linear_model_probability ( int32_t  dimension  ) 

calculates likelihood for linear model on observations in MEMORY

Parameters:
dimension dimension for which probability is calculated
Returns:
model probability

Definition at line 589 of file HMM.h.

bool linear_train ( bool  right_align = false  ) 

estimates linear model from observations.

Definition at line 5103 of file HMM.cpp.

bool load_definitions ( FILE *  file,
bool  verbose,
bool  initialize = true 
)

read definitions file (learn_x, const_x) used for training.

Format specs: definition_file (train.def)

% HMM-TRAIN - specification
% learn_a - elements in state_transition_matrix to be learned
% learn_b - elements in observation_per_state_matrix to be learned
%           note: each line stands for
%           state, observation(0), observation(1)...observation(NOW)
% learn_p - elements in initial distribution to be learned
% learn_q - elements in the end-state distribution to be learned
%
% const_x - specifies initial values of elements
%           rest is assumed to be 0.0
%
% NOTE: IMPLICIT DEFINES:
% define A 0
% define C 1
% define G 2
% define T 3

learn_a=[ [int32_t,int32_t];
          [int32_t,int32_t];
          [int32_t,int32_t];
          ........
          [int32_t,int32_t];
          [-1,-1];
        ];

learn_b=[ [int32_t,int32_t,int32_t,...,int32_t];
          [int32_t,int32_t,int32_t,...,int32_t];
          [int32_t,int32_t,int32_t,...,int32_t];
          ........
          [int32_t,int32_t,int32_t,...,int32_t];
          [-1,-1];
        ];

learn_p= [ int32_t, ... , int32_t, -1 ];

learn_q= [ int32_t, ... , int32_t, -1 ];

const_a=[ [int32_t,int32_t,float64_t];
          [int32_t,int32_t,float64_t];
          [int32_t,int32_t,float64_t];
          ........
          [int32_t,int32_t,float64_t];
          [-1,-1,-1];
        ];

const_b=[ [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          ........
          [int32_t,int32_t,int32_t,...,int32_t,float64_t];
          [-1,-1,-1];
        ];

const_p[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];
const_q[]=[ [int32_t, float64_t], ... , [int32_t,float64_t], [-1,-1] ];

Parameters:
file filehandle to definitions file
verbose true for verbose messages
initialize true to initialize to underlying HMM

Definition at line 3224 of file HMM.cpp.

bool load_model ( FILE *  file  ) 

read model from file.

Format specs: model_file (model.hmm)

% HMM - specification
% N - number of states
% M - number of observation_tokens
% a is state_transition_matrix
% size(a)= [N,N]
%
% b is observation_per_state_matrix
% size(b)= [N,M]
%
% p is initial distribution
% size(p)= [1, N]

N=int32_t;
M=int32_t;

p=[float64_t,float64_t...float64_t];
q=[float64_t,float64_t...float64_t];

a=[ [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
  ];

b=[ [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
    [float64_t,float64_t...float64_t];
  ];

Parameters:
file filehandle to model file

Definition at line 2926 of file HMM.cpp.

float64_t model_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes log dp(lambda)/d a_ij.

Definition at line 1420 of file HMM.h.

float64_t model_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes log dp(lambda)/d b_ij.

Definition at line 1431 of file HMM.h.

float64_t model_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d p_i. backward path down to time 0 multiplied by observing first symbol in path at state i

Definition at line 1406 of file HMM.h.

float64_t model_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes log dp(lambda)/d q_i. forward path up to time T-1

Definition at line 1414 of file HMM.h.

float64_t model_probability ( int32_t  dimension = -1  ) 

inline proxy for model probability.

Definition at line 570 of file HMM.h.

float64_t model_probability_comp (  ) 

calculates probability that observations were generated by the model using forward algorithm.

Definition at line 1234 of file HMM.cpp.

void normalize ( bool  keep_dead_states = false  ) 

normalize the model to satisfy stochasticity

Definition at line 4780 of file HMM.cpp.
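
Stochasticity means that each probability vector of the model sums to 1. A minimal sketch of normalizing one such row is given below. The name `normalize_row` and the handling of all-zero rows are assumptions, and the meaning of keep_dead_states is not modelled here.

```cpp
#include <cassert>
#include <vector>

// Rescales a row of non-negative weights so that it sums to 1.
// Assumption: an all-zero row is left untouched and reported via `false`.
bool normalize_row(std::vector<double>& row) {
    double sum = 0.0;
    for (double x : row) {
        assert(x >= 0.0);
        sum += x;
    }
    if (sum == 0.0) return false;
    for (double& x : row) x /= sum;
    return true;
}
```

Applied to {1, 3}, the row becomes {0.25, 0.75}.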

void open_bracket ( FILE *  file  )  [protected]

expect open bracket.

Definition at line 2756 of file HMM.cpp.

void output_model ( bool  verbose = false  ) 

prints the model parameters on screen.

Parameters:
verbose when false, only the model probability will be printed; when true, the whole model will be printed additionally

Definition at line 2208 of file HMM.cpp.

void output_model_defined ( bool  verbose = false  ) 

performs output_model only for the defined transitions etc

Definition at line 2292 of file HMM.cpp.

float64_t path_derivative_a ( T_STATES  i,
T_STATES  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d a_ij

Definition at line 1467 of file HMM.h.

float64_t path_derivative_b ( T_STATES  i,
uint16_t  j,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d b_ij

Definition at line 1474 of file HMM.h.

float64_t path_derivative_p ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d p_i

Definition at line 1453 of file HMM.h.

float64_t path_derivative_q ( T_STATES  i,
int32_t  dimension 
)

computes d log p(lambda,best_path)/d q_i

Definition at line 1460 of file HMM.h.

bool permutation_entropy ( int32_t  window_width,
int32_t  sequence_number 
)

compute permutation entropy

Definition at line 5407 of file HMM.cpp.

void prepare_path_derivative ( int32_t  dim  )  [protected]

initialization function that is called before path_derivatives are calculated

Definition at line 1511 of file HMM.h.

bool save_likelihood ( FILE *  file  ) 

save model probability in ascii format

Parameters:
file filehandle

Definition at line 4079 of file HMM.cpp.

bool save_likelihood_bin ( FILE *  file  ) 

save model probability in binary format

Parameters:
file filehandle

Definition at line 4062 of file HMM.cpp.

bool save_model ( FILE *  file  ) 

save model to file.

Parameters:
file filehandle to model file

Definition at line 3929 of file HMM.cpp.

bool save_model_bin ( FILE *  file  ) 

save model in binary format.

Parameters:
file filehandle

Definition at line 4100 of file HMM.cpp.

bool save_model_derivatives ( FILE *  file  ) 

save model derivatives to file in ascii format.

Parameters:
file filehandle

Definition at line 4454 of file HMM.cpp.

bool save_model_derivatives_bin ( FILE *  file  ) 

save model derivatives to file in binary format.

Parameters:
file filehandle

Definition at line 4333 of file HMM.cpp.

bool save_path ( FILE *  file  ) 

save viterbi path in ascii format

Parameters:
file filehandle

Definition at line 4038 of file HMM.cpp.

bool save_path_derivatives ( FILE *  file  ) 

save viterbi path derivatives in ascii format

Parameters:
file filehandle

Definition at line 4202 of file HMM.cpp.

bool save_path_derivatives_bin ( FILE *  file  ) 

save viterbi path derivatives in binary format

Parameters:
file filehandle

Definition at line 4250 of file HMM.cpp.

void set_a ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix a

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 1027 of file HMM.h.

void set_A ( T_STATES  line_,
T_STATES  column,
float64_t  value 
)

access function for matrix A

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...N-1
value value to be set

Definition at line 1013 of file HMM.h.

void set_B ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix B

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1041 of file HMM.h.

void set_b ( T_STATES  line_,
uint16_t  column,
float64_t  value 
)

access function for matrix b

Parameters:
line_ row in matrix 0...N-1
column column in matrix 0...M-1
value value to be set

Definition at line 1055 of file HMM.h.

bool set_epsilon ( float64_t  eps  ) 

Definition at line 623 of file HMM.h.

bool set_iterations ( int32_t  num  ) 

Definition at line 621 of file HMM.h.

void set_observation_nocache ( CStringFeatures< uint16_t > *  obs  ) 

sets new observations: only sets the observation pointer and drops caches if there were any

Definition at line 5220 of file HMM.cpp.

void set_observations ( CStringFeatures< uint16_t > *  obs,
CHMM * hmm = NULL 
)

observation functions: set/get the observation matrix. Sets new observations: sets the observation pointer and initializes observation-dependent caches. If hmm is given, then the caches of the model hmm are used.

Definition at line 5262 of file HMM.cpp.

void set_p ( T_STATES  offset,
float64_t  value 
)

access function for probability of first state

Parameters:
offset index 0...N-1
value value to be set

Definition at line 999 of file HMM.h.

void set_pseudo ( float64_t  pseudo  ) 

sets current pseudo value

Definition at line 754 of file HMM.h.

void set_psi ( int32_t  time,
T_STATES  state,
T_STATES  value,
int32_t  dimension 
)

access function for backtracking table psi

Parameters:
time time 0...T-1
state state 0...N-1
value value to be set
dimension dimension of observations 0...DIMENSION-1

Definition at line 1070 of file HMM.h.
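
As an illustration of how a backtracking table such as psi is typically used, the sketch below runs the Viterbi recursion on a toy model and recovers the best state sequence by following psi backwards. It is a minimal sketch under stated assumptions: toy_viterbi, Matrix and the plain-probability arithmetic are hypothetical and not part of CHMM, the end-state distribution q is ignored, and ties are broken in favour of the lower state index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical free function, not the CHMM implementation.
// p: initial distribution (N), a: transitions (N x N), b: emissions (N x M).
// obs must be non-empty; returns the most likely state sequence.
std::vector<std::size_t> toy_viterbi(const std::vector<double>& p,
                                     const Matrix& a, const Matrix& b,
                                     const std::vector<std::size_t>& obs) {
    const std::size_t N = p.size(), T = obs.size();
    Matrix delta(T, std::vector<double>(N, 0.0));
    // psi[t][j] = best predecessor of state j at time t
    std::vector<std::vector<std::size_t>> psi(T, std::vector<std::size_t>(N, 0));

    for (std::size_t i = 0; i < N; ++i)
        delta[0][i] = p[i] * b[i][obs[0]];

    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t j = 0; j < N; ++j) {
            std::size_t best = 0;
            double best_val = delta[t - 1][0] * a[0][j];
            for (std::size_t i = 1; i < N; ++i) {
                const double val = delta[t - 1][i] * a[i][j];
                if (val > best_val) { best_val = val; best = i; }
            }
            psi[t][j] = best;
            delta[t][j] = best_val * b[j][obs[t]];
        }
    }

    // Pick the best final state, then follow psi backwards.
    std::size_t last = 0;
    for (std::size_t i = 1; i < N; ++i)
        if (delta[T - 1][i] > delta[T - 1][last]) last = i;

    std::vector<std::size_t> path(T, 0);
    path[T - 1] = last;
    for (std::size_t t = T - 1; t > 0; --t)
        path[t - 1] = psi[t][path[t]];
    return path;
}
```

For long sequences the products underflow, which is one reason implementations usually work with log probabilities instead.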

void set_q ( T_STATES  offset,
float64_t  value 
)

access function for probability of end states

Parameters:
offset index 0...N-1
value value to be set

Definition at line 986 of file HMM.h.

float64_t state_probability ( int32_t  time,
int32_t  state,
int32_t  dimension 
)

calculates probability of being in state i at time t for dimension

Definition at line 1365 of file HMM.h.
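
In the notation of the Rabiner tutorial cited in the class description, this quantity is gamma_t(i) = alpha_t(i) * beta_t(i) / P(O), where alpha and beta are the forward and backward variables. The following is a minimal sketch of that formula on a toy model, not the CHMM code: ToyHmm, forward_table, backward_table and toy_state_probability are hypothetical names, plain probabilities are used rather than logarithms, and the end-state distribution q is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical toy model; the names are illustrative and not part of CHMM.
struct ToyHmm {
    std::vector<double> p;  // initial state distribution, size N
    Matrix a;               // transition matrix, N x N
    Matrix b;               // emission matrix, N x M
};

// alpha[t][i] = P(o_0..o_t, state_t = i); obs must be non-empty.
Matrix forward_table(const ToyHmm& h, const std::vector<std::size_t>& obs) {
    const std::size_t N = h.p.size(), T = obs.size();
    Matrix alpha(T, std::vector<double>(N, 0.0));
    for (std::size_t i = 0; i < N; ++i)
        alpha[0][i] = h.p[i] * h.b[i][obs[0]];
    for (std::size_t t = 1; t < T; ++t)
        for (std::size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (std::size_t i = 0; i < N; ++i)
                sum += alpha[t - 1][i] * h.a[i][j];
            alpha[t][j] = sum * h.b[j][obs[t]];
        }
    return alpha;
}

// beta[t][i] = P(o_{t+1}..o_{T-1} | state_t = i); beta[T-1][i] = 1.
Matrix backward_table(const ToyHmm& h, const std::vector<std::size_t>& obs) {
    const std::size_t N = h.p.size(), T = obs.size();
    Matrix beta(T, std::vector<double>(N, 1.0));
    for (std::size_t t = T - 1; t-- > 0;)
        for (std::size_t i = 0; i < N; ++i) {
            double sum = 0.0;
            for (std::size_t j = 0; j < N; ++j)
                sum += h.a[i][j] * h.b[j][obs[t + 1]] * beta[t + 1][j];
            beta[t][i] = sum;
        }
    return beta;
}

// gamma_t(i) = alpha_t(i) * beta_t(i) / P(O), with P(O) = sum_i alpha_{T-1}(i).
double toy_state_probability(const ToyHmm& h,
                             const std::vector<std::size_t>& obs,
                             std::size_t time, std::size_t state) {
    const Matrix alpha = forward_table(h, obs);
    const Matrix beta = backward_table(h, obs);
    double likelihood = 0.0;
    for (double v : alpha.back()) likelihood += v;
    return alpha[time][state] * beta[time][state] / likelihood;
}
```

A useful sanity check is that, for any fixed time, the values summed over all states equal 1.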

bool train ( CFeatures *  data = NULL  )  [virtual]

learn distribution

Parameters:
data training data (parameter can be avoided if distance or kernel-based classifiers are used and distance/kernels are initialized with train data)
Returns:
whether training was successful

Implements CDistribution.

Definition at line 444 of file HMM.cpp.

float64_t transition_probability ( int32_t  time,
int32_t  state_i,
int32_t  state_j,
int32_t  dimension 
)

calculates probability of being in state i at time t and state j at time t+1 for dimension

Definition at line 1372 of file HMM.h.
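
For reference, the Rabiner tutorial cited in the class description defines this pairwise quantity as follows. The symbols are the tutorial's, not identifiers from HMM.h: alpha and beta are the forward and backward variables, a_ij is the transition probability, b_j(O_{t+1}) the emission probability of the next symbol, and P(O | lambda) the likelihood of the observation sequence. Whether CHMM evaluates it in exactly this form (for example in log space) is not stated here.

```latex
\xi_t(i,j) = \frac{\alpha_t(i)\, a_{ij}\, b_j(O_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)}
```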


Member Data Documentation

float64_t all_pat_prob [protected]

probability of best path

Definition at line 1234 of file HMM.h.

bool all_path_prob_updated [protected]

true if path probability is up to date

Definition at line 1246 of file HMM.h.

T_ALPHA_BETA alpha_cache [protected]

cache for forward variables; can be very large, O(T*N)

Definition at line 1307 of file HMM.h.

float64_t* arrayN1 [protected]

array of size N for temporary calculations

Definition at line 1271 of file HMM.h.

float64_t* arrayN2 [protected]

array of size N for temporary calculations

Definition at line 1273 of file HMM.h.

T_ALPHA_BETA beta_cache [protected]

cache for backward variables; can be very large, O(T*N)

Definition at line 1309 of file HMM.h.

int32_t conv_it [protected]

Definition at line 1231 of file HMM.h.

distribution of end-states

Definition at line 1220 of file HMM.h.

float64_t epsilon [protected]

convergence criterion epsilon

Definition at line 1230 of file HMM.h.

const int32_t GOTa = (1<<4) [static, protected]

GOTa

Definition at line 1333 of file HMM.h.

const int32_t GOTb = (1<<5) [static, protected]

GOTb

Definition at line 1335 of file HMM.h.

const int32_t GOTconst_a = (1<<5) [static, protected]

GOTconst_a

Definition at line 1350 of file HMM.h.

const int32_t GOTconst_b = (1<<6) [static, protected]

GOTconst_b

Definition at line 1352 of file HMM.h.

const int32_t GOTconst_p = (1<<7) [static, protected]

GOTconst_p

Definition at line 1354 of file HMM.h.

const int32_t GOTconst_q = (1<<8) [static, protected]

GOTconst_q

Definition at line 1356 of file HMM.h.

const int32_t GOTlearn_a = (1<<1) [static, protected]

GOTlearn_a

Definition at line 1342 of file HMM.h.

const int32_t GOTlearn_b = (1<<2) [static, protected]

GOTlearn_b

Definition at line 1344 of file HMM.h.

const int32_t GOTlearn_p = (1<<3) [static, protected]

GOTlearn_p

Definition at line 1346 of file HMM.h.

const int32_t GOTlearn_q = (1<<4) [static, protected]

GOTlearn_q

Definition at line 1348 of file HMM.h.

const int32_t GOTM = (1<<2) [static, protected]

GOTM

Definition at line 1329 of file HMM.h.

const int32_t GOTN = (1<<1) [static, protected]

GOTN

Definition at line 1327 of file HMM.h.

const int32_t GOTO = (1<<3) [static, protected]

GOTO

Definition at line 1331 of file HMM.h.

const int32_t GOTp = (1<<6) [static, protected]

GOTp

Definition at line 1337 of file HMM.h.

const int32_t GOTq = (1<<7) [static, protected]

GOTq

Definition at line 1339 of file HMM.h.
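
The GOT* constants are single-bit masks. A plausible reading, and only an assumption here, is that they are OR-ed into a status word to record which sections have already been seen while a model file is parsed. The sketch below copies the values of the first group from this listing into a hypothetical namespace toy and shows how such masks are combined and tested; has_all is an illustrative helper, not a CHMM member.

```cpp
#include <cassert>
#include <cstdint>

namespace toy {
// Values copied from the listing; the namespace is hypothetical.
const std::int32_t GOTN = (1 << 1);
const std::int32_t GOTM = (1 << 2);
const std::int32_t GOTO = (1 << 3);
const std::int32_t GOTa = (1 << 4);
const std::int32_t GOTb = (1 << 5);
const std::int32_t GOTp = (1 << 6);
const std::int32_t GOTq = (1 << 7);

// True when every bit in `required` is also set in `received`.
inline bool has_all(std::int32_t received, std::int32_t required) {
    return (received & required) == required;
}
}  // namespace toy
```

Setting a flag is then `received |= toy::GOTN;`, and a check such as `toy::has_all(received, toy::GOTN | toy::GOTM)` tells whether both N and M have been recorded. Note that GOTb and GOTconst_a share the value (1<<5), so the groups are presumably kept in separate status words.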

initial distribution of states

Definition at line 1217 of file HMM.h.

int32_t iteration_count [protected]

Definition at line 1227 of file HMM.h.

int32_t iterations [protected]

convergence criterion iterations

Definition at line 1226 of file HMM.h.

int32_t line [protected]

Definition at line 1199 of file HMM.h.

bool loglikelihood [protected]

Definition at line 1255 of file HMM.h.

int32_t M [protected]

number of observation symbols, e.g. ACGT -> 0123

Definition at line 1190 of file HMM.h.

float64_t mod_prob [protected]

probability of model

Definition at line 1240 of file HMM.h.

bool mod_prob_updated [protected]

true if model probability is up to date

Definition at line 1243 of file HMM.h.

Model* model [protected]

Definition at line 1205 of file HMM.h.

int32_t N [protected]

number of states

Definition at line 1193 of file HMM.h.

matrix of absolute counts of observations within each state

Definition at line 1211 of file HMM.h.

distribution of observations within each state

Definition at line 1223 of file HMM.h.

CStringFeatures<uint16_t>* p_observations [protected]

observation matrix

Definition at line 1202 of file HMM.h.

float64_t pat_prob [protected]

probability of best path

Definition at line 1237 of file HMM.h.

T_STATES* path [protected]

best path (=state sequence) through model

Definition at line 1315 of file HMM.h.

int32_t path_deriv_dimension [protected]

dimension for which path_deriv was calculated

Definition at line 1249 of file HMM.h.

bool path_deriv_updated [protected]

true if path derivative is up to date

Definition at line 1252 of file HMM.h.

int32_t path_prob_dimension [protected]

dimension for which path_prob was calculated

Definition at line 1321 of file HMM.h.

bool path_prob_updated [protected]

true if path probability is up to date

Definition at line 1318 of file HMM.h.

float64_t PSEUDO [protected]

define pseudocounts against overfitting

Definition at line 1196 of file HMM.h.
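
To illustrate what a pseudocount does, the sketch below applies additive smoothing when turning raw counts into probabilities, so that symbols never seen in training still get a non-zero probability. This is a generic technique shown under assumptions: smooth_counts is a hypothetical helper, and the exact way CHMM applies PSEUDO during estimation is not documented on this page.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: probs[k] = (counts[k] + pseudo) / sum_k (counts[k] + pseudo).
// Assumes the smoothed total is positive (not all counts zero with pseudo == 0).
std::vector<double> smooth_counts(const std::vector<double>& counts, double pseudo) {
    double total = 0.0;
    for (double c : counts) total += c + pseudo;

    std::vector<double> probs;
    probs.reserve(counts.size());
    for (double c : counts) probs.push_back((c + pseudo) / total);
    return probs;
}
```

With counts {3, 0, 1} and a pseudocount of 1, the unseen second symbol gets probability 1/7 instead of 0.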

bool reused_caches [protected]

Definition at line 1261 of file HMM.h.

backtracking table for Viterbi; can be very large, O(T*N)

Definition at line 1312 of file HMM.h.

bool status [protected]

Definition at line 1258 of file HMM.h.

transition matrix

Definition at line 1214 of file HMM.h.

matrix of absolute counts of transitions

Definition at line 1208 of file HMM.h.


The documentation for this class was generated from the following files:

SHOGUN Machine Learning Toolbox - Documentation