\page modular_tutorial Tutorial for Modular Interfaces
SHOGUN's "modular" interfaces to Python, Octave, Java, Lua, Ruby and C# give
intuitive and easy access to shogun's functionality. Compared to the static
interfaces (\subpage staticinterfaces), the modular ones are much more flexible
and allow for nearly unlimited extensibility.
If this is your first time using shogun, you've found the right place to start!
In this tutorial, we demonstrate how to use shogun to create a simple
Gaussian-kernel-based Support Vector Machine (SVM) classifier, but first things
first. Let's fire up python, octave, java, lua, ruby or C#, and load the
modular shogun environment.

\section start_shogun_modular Starting SHOGUN

To load all of shogun's modules under octave, start octave and issue
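a single command, as in shogun's octave-modular examples (a sketch assuming the
modshogun module is on octave's load path):

\verbatim
modshogun
\endverbatim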
Under python we have to specify what we'd like to import.
For this example, we will need features and labels to represent our input data,
a classifier, a kernel to compare the similarity of features and
some evaluation procedures:

\verbatim
from shogun.Features import *
from shogun.Kernel import *
from shogun.Classifier import *
from shogun.Evaluation import *
\endverbatim
Under java we need to import the shogun package and the jblas package.
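A minimal sketch of these imports, modeled on the java-modular examples
(org.jblas provides the DoubleMatrix type used below):

\verbatim
import org.shogun.*;
import org.jblas.*;
import static org.jblas.DoubleMatrix.*;
\endverbatim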
Under lua we need to set LUA_PATH and LUA_CPATH so that the *.lua and *.so
libraries required by lua_modular can be found. For instance, if the current
directory is examples/undocumented/lua_modular, we set the paths as follows:

\verbatim
export LUA_PATH=../../../src/interfaces/lua_modular/?.lua\;?.lua
export LUA_CPATH=../../../src/interfaces/lua_modular/?.so
\endverbatim
Under ruby we need to import the shogun package and the narray package.
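In the ruby-modular examples these are plain requires (a sketch; narray is the
numeric array package shogun's ruby interface builds on):

\verbatim
require 'modshogun'
require 'narray'
\endverbatim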
Under C# we need to import the System package.
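That is the usual using directive at the top of the file:

\verbatim
using System;
\endverbatim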
In these languages we import the whole shogun package, which includes all of
the subpackages, such as kernel, classifier, distance and so on.
Under java we also have to specify which native libraries to load.
For this example, we will need features and labels to represent our input data,
a classifier, and a kernel to compare the similarity of features:

\verbatim
System.loadLibrary("Features");
System.loadLibrary("Classifier");
System.loadLibrary("Kernel");
\endverbatim
We also need to use a method of the Features class to initialize shogun, as
follows:

\verbatim
Features.init_shogun_with_defaults();
\endverbatim
(Note here that the module shogun.Features contains the class Labels.)

\section docu_shogun_modular Getting Help

If you would like to get an overview of the classes in shogun or
if you are looking for the documentation of a particular class,
use the "Classes" tab above and browse through the class list.
All classes are rather well documented (if the documentation of a particular
class is not good enough, please notify us). Note that all classes in the
class list (classes tab above) are prefixed with a 'C' - that
really is the only difference to using them from within the python or
octave modular interfaces (where the C prefix is missing).
Alternatively, under python the same documentation is available via
python help strings, so you may issue

\verbatim
from shogun.Kernel import GaussianKernel
help(GaussianKernel)
\endverbatim

\section toy_tutorial_modular Generating a toy dataset

To start with, we will generate a small toy dataset. In python, we need to
import numpy to do this. We will generate real valued training (and testing)
data drawn from a Gaussian distribution. We will generate two Gaussian bumps
that are "dist" apart. The data comes as a matrix with one column per object
and as many columns as we have examples. Additionally, we need a vector of
labels (a vector of ones and minus ones) that indicates the class of each
example:

\verbatim
from numpy import concatenate, ones
from numpy.random import randn

dist = 0.5  # distance between the two bumps; an example value

traindata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
testdata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
train_labels = concatenate((-ones(100), ones(100)))
test_labels = concatenate((-ones(100), ones(100)))
\endverbatim
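As a quick sanity check (a sketch assuming the numpy arrays built above), the
training matrix should be 2 x 200 and the label vector of length 200:

\verbatim
print(traindata_real.shape)  # (2, 200)
print(train_labels.shape)    # (200,)
\endverbatim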
In octave, this is done as follows:

\verbatim
dist = 0.5;  % distance between the two bumps; an example value

traindata_real = [randn(2,100)-dist, randn(2,100)+dist];
testdata_real = [randn(2,100)-dist, randn(2,100)+dist];
train_labels = [-ones(1,100), ones(1,100)];
test_labels = [-ones(1,100), ones(1,100)];
\endverbatim
In java, this is done as follows:

\verbatim
// ones(), randn() and concatHorizontally() are static imports
// from org.jblas.DoubleMatrix (see the imports above)
int num = 100;      // number of examples per class; an example value
double dist = 0.5;  // distance between the two bumps; an example value

DoubleMatrix offs = ones(2, num).mmul(dist);
DoubleMatrix x = randn(2, num).sub(offs);
DoubleMatrix y = randn(2, num).add(offs);
DoubleMatrix traindata_real = concatHorizontally(x, y);

DoubleMatrix m = randn(2, num).sub(offs);
DoubleMatrix n = randn(2, num).add(offs);
DoubleMatrix testdata_real = concatHorizontally(m, n);

DoubleMatrix o = ones(1, num);
DoubleMatrix trainlab = concatHorizontally(o.neg(), o);
DoubleMatrix testlab = concatHorizontally(o.neg(), o);
\endverbatim
In lua, this is done as follows:

\verbatim
-- append the columns of all following matrices to the first one
function concatenate(...)
	local result = table.remove({...}, 1)
	for _,t in ipairs{select(2, ...)} do
		for row,rowdata in ipairs(t) do
			for col,coldata in ipairs(rowdata) do
				table.insert(result[row], coldata)
			end
		end
	end
	return result
end

-- a rows x cols matrix of random values shifted by dist
function rand_matrix(rows, cols, dist)
	local matrix = {}
	for i = 1, rows do
		matrix[i] = {}
		for j = 1, cols do
			matrix[i][j] = math.random() + dist
		end
	end
	return matrix
end

num = 100    -- number of examples per class; an example value
dist = 0.5   -- distance between the two bumps; an example value

traindata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))
testdata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))

trainlab = {}
for i = 1, num do
	trainlab[i] = -1
	trainlab[i + num] = 1
end

testlab = {}
for i = 1, num do
	testlab[i] = -1
	testlab[i + num] = 1
end
\endverbatim
In ruby, this is done as follows, using a few small helper methods:

\verbatim
@num = 100   # number of examples per class; an example value
@dist = 0.5  # distance between the two bumps; an example value

# a vector of @num random values shifted by dist
def ary_fill dist
	ary = []
	@num.times do
		ary << rand + dist
	end
	return ary
end

# a 2-row matrix with one bump per half of each row
def gen_rand_ary
	ary = [[],[]]
	ary.each do |p|
		p << ary_fill( @dist ) + ary_fill( -@dist )
		p.flatten!
	end
	return ary
end

# @num minus-one labels followed by @num plus-one labels
def gen_ones_vec
	ary = []
	@num.times { ary << -1 }
	@num.times { ary << 1 }
	return ary
end

puts "generating training data"
traindata_real = gen_rand_ary
testdata_real = gen_rand_ary

puts "generating labels"
trainlab = gen_ones_vec
testlab = gen_ones_vec
\endverbatim
In C#, this is done as follows:

\verbatim
int num = 100;      // number of examples per class; an example value
double dist = 0.5;  // distance between the two bumps; an example value

Random RandomNumber = new Random();

double[,] traindata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
	traindata_real[0, i] = RandomNumber.NextDouble() - dist;
	traindata_real[0, i + num] = RandomNumber.NextDouble() + dist;
	traindata_real[1, i] = RandomNumber.NextDouble() - dist;
	traindata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[,] testdata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
	testdata_real[0, i] = RandomNumber.NextDouble() - dist;
	testdata_real[0, i + num] = RandomNumber.NextDouble() + dist;
	testdata_real[1, i] = RandomNumber.NextDouble() - dist;
	testdata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[] trainlab = new double[num * 2];
for (int i = 0; i < num; i++) {
	trainlab[i] = -1;
	trainlab[i + num] = 1;
}

double[] testlab = new double[num * 2];
for (int i = 0; i < num; i++) {
	testlab[i] = -1;
	testlab[i + num] = 1;
}
\endverbatim
The rest of this tutorial below will now work the same (identical syntax) for
python, octave (when using a trailing semicolon for each command, which is
optional in python), lua (where we use a colon to call a method on an object),
ruby and C#.
For java, we use DoubleMatrix to represent most of the types. If a shogun C++
class needs an int vector/matrix, we convert the DoubleMatrix into an int
vector/matrix, and when it is returned from the function, we convert the int
vector/matrix back into a DoubleMatrix.

\section svm_tutorial_modular Creating an SVM classifier

To process the above toy data in shogun, we need to create a shogun feature
object, here RealFeatures (for dense real valued feature matrices, see also
shogun::CSimpleFeatures), like this:

\verbatim
feats_train = RealFeatures(traindata_real);
feats_test = RealFeatures(testdata_real);
\endverbatim
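Under python one can quickly verify that the data arrived intact (a sketch;
get_num_vectors() and get_num_features() are the accessors on shogun's dense
feature objects):

\verbatim
print(feats_train.get_num_vectors())   # 200
print(feats_train.get_num_features())  # 2
\endverbatim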
Using the above feature object we can now create a kernel object. Here, we
create a Gaussian kernel of a certain width (see also shogun::CGaussianKernel)
based on our training features:

\verbatim
width = 2.1; # an example kernel width
kernel = GaussianKernel(feats_train, feats_train, width);
\endverbatim
and can now compute the kernel matrix:

\verbatim
km = kernel.get_kernel_matrix();
\endverbatim
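Under python, km comes back as a native numpy array, so it can be inspected
directly; for our 200 training examples it is a symmetric 200 x 200 matrix
(a sketch assuming numpy):

\verbatim
from numpy import allclose
print(km.shape)            # (200, 200)
print(allclose(km, km.T))  # True: kernel matrices are symmetric
\endverbatim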
To train an SVM, we need labeled examples, i.e. a vector of ones and minus
ones, such as the one we have previously stored in the variable train_labels.
We now create a shogun label object from it (the same goes for our test labels,
which we'll use later):

\verbatim
labels = Labels(train_labels);
labels_test = Labels(test_labels);
\endverbatim
Given the labels object and the kernel, all that is left to do is to specify a
cost parameter C (used to control generalization performance) and we can
construct an SVM object. To start the training, we simply invoke the train
method of the SVM object. Quite easy, isn't it?

\verbatim
C = 1.0; # an example cost value
svm = LibSVM(C, kernel, labels);
svm.train();
\endverbatim
To apply the SVM to unseen test data, we simply need to pass a feature object
to the SVM's apply method, which returns a shogun Label object (note that we
could alternatively initialize the kernel object with the train and test data
manually and then call apply without arguments, which is done in some of the
other example scripts). If we would like to analyze the outputs directly in
python/octave, we can obtain the vector of outputs in native python/octave
representation via get_labels():

\verbatim
output = svm.apply(feats_test);
output_vector = output.get_labels();
\endverbatim
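For instance, under python one could already estimate the accuracy by hand from
this vector (a sketch assuming numpy and the test_labels array generated
earlier; the raw outputs are real valued, so we threshold them with sign()):

\verbatim
from numpy import mean, sign
print(mean(sign(output_vector) == test_labels))
\endverbatim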
Given the output and the test labels, we can now assess the prediction
performance. For this, we create an instance of the class PerformanceMeasures,
which provides a convenient way of obtaining various performance measures, such
as accuracy (acc), area under the receiver operating characteristic curve
(auROC), F-measure and others:

\verbatim
pm = PerformanceMeasures(labels_test, output);
acc = pm.get_accuracy();
roc = pm.get_auROC();
fms = pm.get_fmeasure();
\endverbatim
That's really it. For any of the advanced topics please have a look at one of
the \b many self-explanatory examples in
\li examples/octave-modular also available online \subpage octave_modular_examples "here"
\li examples/python-modular also available online \subpage python_modular_examples "here"
A full, working example similar to the one presented above is shown below:
In octave:
\verbinclude classifier_libsvm_minimal_modular.m

In java:
\verbinclude classifier_libsvm_minimal_modular.java

In lua:
\verbinclude classifier_libsvm_minimal_modular.lua

In ruby:
\verbinclude classifier_libsvm_minimal_modular.rb

In C#:
\verbinclude classifier_libsvm_minimal_modular.cs

In python:
\verbinclude classifier_libsvm_minimal_modular.py