As mentioned before, SHOGUN interfaces to several programming languages and toolkits such as Matlab(tm), R, Python and Octave. The following sections give an overview of the static interface commands of SHOGUN. For the static interfaces we tried to keep the command syntax consistent across all languages. In some cases this was not possible, and we document the subtle differences in syntax and semantics for the respective toolkit. Instead of reading through all of this, we suggest having a look at the large number of examples available in the per-interface examples directories, for example examples/R or examples/python.

Overview of Static Interfaces & Testing the Installation

Interface Commands

Command Reference

Function Reference

Overview of Static Interfaces & Testing the Installation

Static Matlab and Octave Interface

Since Octave is nowadays on par with Matlab, a single documentation for both interfaces is sufficient. It is based on Octave, but Matlab can be used interchangeably.

To start SHOGUN in Octave, start Octave and check that it is correctly installed by typing (let ">" be the Octave prompt)

  sg('help')

inside of octave. This should show you some help text.

Static Python Interface

To start SHOGUN in Python, start Python and check that it is correctly installed by typing (let ">" be the Python prompt)

  from sg import sg
  sg('help')

inside of python. This should show you some help text.
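
When the check is part of a script, it can help to test for the module before importing it. The following is a minimal sketch, assuming only that the static Python interface is importable under the module name `sg` as shown above. The helper name `sg_is_installed` is made up for illustration and is not part of SHOGUN.

```python
import importlib.util


def sg_is_installed(module_name="sg"):
    """Return True if a top-level module with the given name can be found."""
    # find_spec returns None when the module cannot be located on the path.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if sg_is_installed():
        from sg import sg
        sg('help')  # should show the SHOGUN help text
    else:
        print("The sg module was not found; check the SHOGUN installation.")
```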

Static R Interface

To fire up SHOGUN in R make sure that you have SHOGUN correctly installed in R. You can check this by typing ( let ">" be the R prompt ):

  > library()

inside of R. This command lists all R packages that have been installed on your system. You should see an entry like:

  sg                     The SHOGUN Machine Learning Toolbox

Once you have made sure that SHOGUN is installed correctly, you can load it via:

  > library(sg)

You will see some information about the SHOGUN core (compile options etc.). After this command, R and SHOGUN are ready to receive your commands.

In general all commands in SHOGUN are issued using the function sg(...). To invoke the SHOGUN command help one types:

  > sg('help')

and then a help text appears giving a short description of all commands.

Static Interface Commands

Features

These functions transfer data from the interface to SHOGUN and back. Suppose you have a Matlab or R matrix "features" that contains your training data. To register this data with SHOGUN, you simply type:

Transfer the features to shogun

set_features

sg('set_features', 'TRAIN|TEST', features[, DNABINFILE|<ALPHABET>])

add_features

sg('add_features', 'TRAIN|TEST', features[, DNABINFILE|<ALPHABET>])

Features can be char/byte/word/int/real valued matrices, real valued sparse matrices, or strings (lists or cell arrays of strings). When dealing with strings, an alphabet name has to be specified (DNA, RAW, ...). Use 'TRAIN' to tell SHOGUN that this is the data you want to train your classifier on, and 'TEST' for the test data.

In contrast to set_features, add_features will create a combined feature object and append the features to it. This is useful when dealing with a set of different features (real valued and strings) and multiple kernels.

In case a single string was set using set_features, it can be "multiplexed" by sliding a window over it using

from_position_list

sg('from_position_list', 'TRAIN|TEST', winsize, shift[, skip])

obtain_from_sliding_window

sg('obtain_from_sliding_window', winsize, skip)
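
The exact semantics of `winsize`, `shift` and `skip` are not spelled out here, so the following pure Python sketch only illustrates the general idea of multiplexing one string into many overlapping substrings. The function name `sliding_windows` and its `step` parameter are hypothetical; they are not SHOGUN commands or arguments.

```python
def sliding_windows(sequence, winsize, step=1):
    """Cut one string into windows of length winsize, advancing by step."""
    if winsize <= 0 or step <= 0:
        raise ValueError("winsize and step must be positive")
    # Last start position at which a full window still fits.
    last_start = len(sequence) - winsize
    return [sequence[i:i + winsize] for i in range(0, last_start + 1, step)]
```

For a DNA string such as "ACGTAC" and a window size of 3, this yields the four substrings starting at positions 0 to 3. A string shorter than the window yields no substrings.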

The following command deletes the features that were assigned earlier in the current SHOGUN session.

clean_features
```
sg('clean_features') 
```

Obtain the Features from shogun

get_features

[features]=sg('get_features', 'TRAIN|TEST')

One proceeds similarly when assigning labels to the training data and obtaining labels from SHOGUN. The commands

set_labels
```
sg('set_labels', 'TRAIN', trainlab) 
```

get_labels

[labels]=sg('get_labels', 'TRAIN|TEST')

tell SHOGUN that the labels of the assigned training data reside in trainlab and return the current labels, respectively (note that currently all data is copied into SHOGUN, so modifications to trainlab are local within the interface).

Kernel & Distances

Kernel and distance matrix specific commands, used to create, set and obtain the kernel matrix.

Creating a kernel in shogun

set_kernel

sg('set_kernel', 'KERNELNAME', 'FEATURETYPE', CACHESIZE, PARAMETERS)

add_kernel

sg('add_kernel', WEIGHT, 'KERNELNAME', 'FEATURETYPE', CACHESIZE, PARAMETERS)

Here KERNELNAME is the name of the kernel one wishes to use, FEATURETYPE is the type of features (e.g. REAL for standard real valued feature vectors), CACHESIZE is the size of the kernel cache in megabytes, and PARAMETERS are additional kernel specific parameters.

Supported Kernels

The following kernels are implemented in SHOGUN:

AUC
Chi2
Spectrum
Const Kernel
User defined CustomKernel
Diagonal Kernel
Kernel from Distance
Fixed Degree StringKernel
Gaussian \( k(x,x')=e^{-\frac{||x-x'||^2}{\sigma}} \)

To work with a gaussian kernel on real values one issues:

sg('set_kernel', 'GAUSSIAN', 'TYPE', CACHESIZE, SIGMA)

For example:

sg('set_kernel', 'GAUSSIAN', 'REAL', 40, 1)

creates a gaussian kernel on real values with a cache size of 40MB and a sigma value of one. Available types for the gaussian kernel: REAL, SPARSEREAL.
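
To make the formula concrete, here is a small pure Python sketch that evaluates the Gaussian kernel exactly as written above, dividing the squared distance by sigma. It does not call SHOGUN; the function name `gaussian_kernel` is chosen for illustration only.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Evaluate k(x, y) = exp(-||x - y||^2 / sigma) for two real vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / sigma)
```

With this definition, identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart or as sigma shrinks.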

Gaussian Shift Kernel
Histogram Kernel
Linear \(k(x,x')=x\cdot x'\)

A linear kernel is created via:

sg('set_kernel', 'LINEAR', 'TYPE', CACHESIZE)

For example:

sg('add_kernel', 1.0, 'LINEAR', 'REAL', 50)

adds a linear kernel with a cache size of 50 for real valued data, with weight 1.0.

Available types for the linear kernel: BYTE, WORD, CHAR, REAL, SPARSEREAL.

Local Alignment StringKernel
Locality Improved StringKernel
Polynomial Kernel \(k(x,x')=(x\cdot x')^d\)

A polynomial kernel is created via:

sg('set_kernel', 'POLY', 'TYPE', CACHESIZE, DEGREE, INHOMOGENE, NORMALIZE)

For example:

sg('add_kernel', 0.1, 'POLY', 'REAL', 50, 3, 0)

adds a polynomial kernel. Available types for the polynomial kernel: REAL, CHAR, SPARSEREAL.
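
The linear and polynomial formulas given above can be written out in a few lines of plain Python. This is only an illustration of the two formulas as stated; it does not model the INHOMOGENE or NORMALIZE options, and the function names are not SHOGUN identifiers.

```python
def linear_kernel(x, y):
    """Evaluate k(x, y) = x . y, the dot product of two real vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree):
    """Evaluate k(x, y) = (x . y)^d, the homogeneous polynomial kernel."""
    return linear_kernel(x, y) ** degree
```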

Salzberg Kernel
Sigmoid Kernel

To work with a sigmoid kernel on real values one issues:

sg('set_kernel', 'SIGMOID', 'TYPE', CACHESIZE, GAMMA, COEFF)

For example:

sg('set_kernel', 'SIGMOID', 'REAL', 40, 0.1, 0.1)

creates a sigmoid kernel on real values with a cache size of 40MB, a gamma value of 0.1 and a coefficient of 0.1. Available types for the sigmoid kernel: REAL.

Weighted Spectrum Kernel
Weighted Degree Kernels
Match Kernel
Custom Kernel

Assign a user defined custom kernel, for which either only the upper triangle is given (DIAG), or the full matrix (FULL), or the full matrix which is then internally stored as an upper triangle (FULL2DIAG).

set_custom_kernel

sg('set_custom_kernel', kernelmatrix, 'DIAG|FULL|FULL2DIAG')
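
Because a kernel matrix is symmetric, its upper triangle carries all the information, which is why a triangle-only storage option is useful. The sketch below extracts the upper triangle of a square matrix in plain Python. The row-by-row ordering used here is an assumption made for illustration; the layout SHOGUN expects for DIAG is not specified in this section, and the helper names are hypothetical.

```python
def is_symmetric(matrix):
    """Check that a square matrix equals its transpose."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i + 1, n))


def upper_triangle(matrix):
    """Return the upper triangle including the diagonal, row by row."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("matrix must be square")
    # Row i contributes the entries from column i to column n - 1.
    return [matrix[i][j] for i in range(n) for j in range(i, n)]
```

For an n by n matrix the triangle holds n(n+1)/2 values instead of n^2, which roughly halves the memory needed for large kernel matrices.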

The get_kernel_matrix and get_distance_matrix commands return the kernel or distance matrix for the current problem.

get_distance_matrix

[D]=sg('get_distance_matrix', 'TRAIN|TEST')

get_kernel_matrix

[K]=sg('get_kernel_matrix', 'TRAIN|TEST')

Here D and K are matrix objects.

SVM

new_classifier Creates a new classifier (e.g. SVM instance).
train_classifier Starts the training of the SVM on the assigned features and kernels.

The get_svm command returns some properties of an SVM such as the Lagrange multipliers alpha, the bias b and the index of the support vectors SV (zero based).

get_classifier
```
[bias, alphas]=sg('get_svm') 
```
set_classifier
```
sg('set_classifier', bias, alphas) 
```

This command returns a list of arguments. set_classifier may later be used (after creating an SVM classifier) to set alphas and bias again.
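
The two commands can be paired to save the state of a trained SVM and put it back later. The following is a minimal sketch in Python; the `sg` callable is passed in as an argument so the helpers do not assume the module is importable, and the helper names `save_svm_state` and `restore_svm_state` are hypothetical. It assumes that the Python interface returns bias and alphas as two values, in the order shown above.

```python
def save_svm_state(sg):
    """Fetch bias and alphas of the current SVM through the sg callable."""
    bias, alphas = sg('get_svm')
    return bias, alphas


def restore_svm_state(sg, bias, alphas):
    """Write a previously saved bias and alphas back into the classifier."""
    # Assumes a classifier has already been created with new_classifier.
    sg('set_classifier', bias, alphas)
```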

The result of the classification of the test sample is obtained via:

classify
```
[result]=sg('classify') 
```
classify_example
```
[result]=sg('classify_example', feature_vector_index) 
```
where result is a vector containing the classification result for each data point. classify_example obtains the output for a single example only (the index is zero based, as in Python; note that Octave, Matlab and R are one based).
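
The index convention is an easy source of off-by-one errors when the example number comes from a one based environment. A small sketch of the conversion follows; the `sg` callable is passed in as an argument, and the wrapper name `classify_one_based` is made up for illustration.

```python
def classify_one_based(sg, one_based_index):
    """Classify a single example given its one based position."""
    if one_based_index < 1:
        raise ValueError("one based indices start at 1")
    # classify_example expects a zero based index, so shift by one.
    return sg('classify_example', one_based_index - 1)
```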

HMM

get_hmm
set_hmm
hmm_classify
hmm_classify_example
hmm_likelihood
get_viterbi_path

POIM

compute_poim_wd
get_SPEC_consensus
get_SPEC_scoring
get_WD_consensus
get_WD_scoring

Utility

Miscellaneous functions.

Returns the svn version number

get_version
```
sg('get_version') 
```

Gives you a help text.

help
```
sg('help') 
```
help
```
sg('help', 'CMD') 
```

Sets a debugging log level - useful to trace errors.

loglevel
```
sg('loglevel', 'LEVEL') 
```
LEVEL can be one of ALL, DEBUG, WARN, ERROR:
ALL: very verbose logging output (useful only for hunting memory leaks)
DEBUG: verbose logging output (useful for debugging).
WARN: less logging output (useful for error search).
ERROR: only logging output on critical errors.
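
A script that takes the level from user input may want to validate it before passing it on. The sketch below does this in Python, using only the four level names listed above. The `sg` callable is passed in as an argument, and the names `LOG_LEVELS` and `set_loglevel` are hypothetical helpers rather than part of SHOGUN.

```python
LOG_LEVELS = ("ALL", "DEBUG", "WARN", "ERROR")


def set_loglevel(sg, level):
    """Validate a log level name and forward it to the loglevel command."""
    normalized = level.upper()
    if normalized not in LOG_LEVELS:
        raise ValueError("unknown log level: %r" % level)
    sg('loglevel', normalized)
    return normalized
```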

For example

  > sg('loglevel', 'ALL')

enables the most verbose logging output.

Let's get started. Equipped with the above information on the basic SHOGUN commands, you are now able to create your own SHOGUN applications.

Example

Let us discuss an example:

```
sg('set_features', 'TRAIN', traindat) 
```
registers the training sample, which resides in traindat.

```
sg('set_labels', 'TRAIN', trainlab) 
```
registers the training labels.

```
sg('set_kernel', 'GAUSSIAN', 'REAL', 100, 1.0) 
```
creates a new Gaussian kernel for reals with a cache size of 100 MB and width = 1.

```
sg('new_classifier', 'SVMLIGHT') 
```
creates a new SVM object inside the SHOGUN core.

```
sg('c', 20.0)  
```
sets the C value of the new SVM to 20.0.

```
sg('train_classifier') 
```
attaches the data to the kernel, does some initialization and then starts the training on the sample.

```
sg('set_features', 'TEST', testdat) 
```
registers the test sample.

```
out=sg('classify') 
```
attaches the data to the kernel and classifies, then returns the classification result as a vector.
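
The steps above can be collected into a single Python function. This is a sketch rather than a definitive script: it issues exactly the commands of the walkthrough in the same order, and it takes the `sg` callable as an argument (for instance the one obtained via `from sg import sg`). The function name `train_and_classify` and its keyword arguments are made up for illustration.

```python
def train_and_classify(sg, traindat, trainlab, testdat,
                       c=20.0, width=1.0, cache_mb=100):
    """Train an SVM on the training data and classify the test data."""
    sg('set_features', 'TRAIN', traindat)        # register training sample
    sg('set_labels', 'TRAIN', trainlab)          # register training labels
    sg('set_kernel', 'GAUSSIAN', 'REAL', cache_mb, width)
    sg('new_classifier', 'SVMLIGHT')             # create the SVM object
    sg('c', c)                                   # set the C value
    sg('train_classifier')                       # train on the sample
    sg('set_features', 'TEST', testdat)          # register test sample
    return sg('classify')                        # classification results
```

Passing `sg` in explicitly also makes the command sequence easy to check with a stand-in callable before running it against a real SHOGUN installation.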
