\page modular_tutorial Tutorial for Modular Interfaces
SHOGUN's "modular" interfaces to Python, Octave, Java, Lua, Ruby and C# give
intuitive and easy access to shogun's functionality. Compared to the static
interfaces (\subpage staticinterfaces), the modular ones are much more flexible
and allow for nearly unlimited extensibility.
If this is your first time using shogun, you've found the right place to start!
In this tutorial, we demonstrate how to use shogun to create a simple
Gaussian-kernel-based Support Vector Machine (SVM) classifier, but first things
first. Let's fire up python, octave, java, lua, ruby or C#, and load the
modular shogun environment.

\section start_shogun_modular Starting SHOGUN

To load all of shogun's modules under octave, start octave and issue
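a single command, as in shogun's octave-modular examples (a sketch assuming the
modshogun module is on octave's load path):

\verbatim
modshogun
\endverbatim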
Under python we have to specify what we'd like to import.
For this example, we will need features and labels to represent our input data,
a classifier, a kernel to compare the similarity of features and
some evaluation procedures:

\verbatim
from shogun.Features import *
from shogun.Kernel import *
from shogun.Classifier import *
from shogun.Evaluation import *
\endverbatim
Under java we need to import the shogun package and the jblas package.
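A minimal sketch of these imports, modeled on the java-modular examples
(org.jblas provides the DoubleMatrix type used below):

\verbatim
import org.shogun.*;
import org.jblas.*;
import static org.jblas.DoubleMatrix.*;
\endverbatim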
Under lua we need to set LUA_PATH and LUA_CPATH so that the *.lua and *.so
libraries required by lua_modular can be found. For instance, if the current
directory is examples/undocumented/lua_modular, we set the paths as follows:

\verbatim
export LUA_PATH=../../../src/interfaces/lua_modular/?.lua\;?.lua
export LUA_CPATH=../../../src/interfaces/lua_modular/?.so
\endverbatim
Under ruby we need to import the shogun package and the narray package.
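In the ruby-modular examples these are plain requires (a sketch; narray is the
numeric array package shogun's ruby interface builds on):

\verbatim
require 'modshogun'
require 'narray'
\endverbatim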
Under C# we need to import the System package.
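That is the usual using directive at the top of the file:

\verbatim
using System;
\endverbatim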
In these languages we import the whole shogun package, which includes all of
the subpackages, such as kernel, classifier, distance and so on.
Under java we also have to specify which native libraries to load.
For this example, we will need features and labels to represent our input data,
a classifier, and a kernel to compare the similarity of features:

\verbatim
System.loadLibrary("Features");
System.loadLibrary("Classifier");
System.loadLibrary("Kernel");
\endverbatim
We also need to use a method of the Features class to initialize shogun, as
follows:

\verbatim
Features.init_shogun_with_defaults();
\endverbatim
(Note here that the module shogun.Features contains the class Labels.)

\section docu_shogun_modular Getting Help

If you would like to get an overview of the classes in shogun or
if you are looking for the documentation of a particular class,
use the "Classes" tab above and browse through the class list.
All classes are rather well documented (if the documentation of a particular
class is not good enough, please notify us). Note that all classes in the
class list (classes tab above) are prefixed with a 'C' - that
really is the only difference to using them from within the python or
octave modular interfaces (where the C prefix is missing).
Alternatively, under python the same documentation is available via
python help strings, so you may issue

\verbatim
from shogun.Kernel import GaussianKernel
help(GaussianKernel)
\endverbatim

\section toy_tutorial_modular Generating a toy dataset

To start with, we will generate a small toy dataset. In python, we need to
import numpy to do this. We will generate real valued training (and testing)
data drawn from a Gaussian distribution. We will generate two Gaussian bumps
that are "dist" apart. The data comes as a matrix with one column per object
and as many columns as we have examples. Additionally, we need a vector of
labels (a vector of ones and minus ones) that indicates the class of each
example:

\verbatim
from numpy import concatenate, ones
from numpy.random import randn

dist = 0.5  # distance between the two bumps; an example value

traindata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
testdata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
train_labels = concatenate((-ones(100), ones(100)))
test_labels = concatenate((-ones(100), ones(100)))
\endverbatim
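As a quick sanity check (a sketch assuming the numpy arrays built above), the
training matrix should be 2 x 200 and the label vector of length 200:

\verbatim
print(traindata_real.shape)  # (2, 200)
print(train_labels.shape)    # (200,)
\endverbatim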
In octave, this is done as follows:

\verbatim
dist = 0.5;  % distance between the two bumps; an example value

traindata_real = [randn(2,100)-dist, randn(2,100)+dist];
testdata_real = [randn(2,100)-dist, randn(2,100)+dist];
train_labels = [-ones(1,100), ones(1,100)];
test_labels = [-ones(1,100), ones(1,100)];
\endverbatim
In java, this is done as follows:

\verbatim
// ones(), randn() and concatHorizontally() are static imports
// from org.jblas.DoubleMatrix (see the imports above)
int num = 100;      // number of examples per class; an example value
double dist = 0.5;  // distance between the two bumps; an example value

DoubleMatrix offs = ones(2, num).mmul(dist);
DoubleMatrix x = randn(2, num).sub(offs);
DoubleMatrix y = randn(2, num).add(offs);
DoubleMatrix traindata_real = concatHorizontally(x, y);

DoubleMatrix m = randn(2, num).sub(offs);
DoubleMatrix n = randn(2, num).add(offs);
DoubleMatrix testdata_real = concatHorizontally(m, n);

DoubleMatrix o = ones(1, num);
DoubleMatrix trainlab = concatHorizontally(o.neg(), o);
DoubleMatrix testlab = concatHorizontally(o.neg(), o);
\endverbatim
In lua, this is done as follows:

\verbatim
-- append the columns of all following matrices to the first one
function concatenate(...)
	local result = table.remove({...}, 1)
	for _,t in ipairs{select(2, ...)} do
		for row,rowdata in ipairs(t) do
			for col,coldata in ipairs(rowdata) do
				table.insert(result[row], coldata)
			end
		end
	end
	return result
end

-- a rows x cols matrix of random values shifted by dist
function rand_matrix(rows, cols, dist)
	local matrix = {}
	for i = 1, rows do
		matrix[i] = {}
		for j = 1, cols do
			matrix[i][j] = math.random() + dist
		end
	end
	return matrix
end

num = 100    -- number of examples per class; an example value
dist = 0.5   -- distance between the two bumps; an example value

traindata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))
testdata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))

trainlab = {}
for i = 1, num do
	trainlab[i] = -1
	trainlab[i + num] = 1
end

testlab = {}
for i = 1, num do
	testlab[i] = -1
	testlab[i + num] = 1
end
\endverbatim
In ruby, this is done as follows, using a few small helper methods:

\verbatim
@num = 100   # number of examples per class; an example value
@dist = 0.5  # distance between the two bumps; an example value

# a vector of @num random values shifted by dist
def ary_fill dist
	ary = []
	@num.times do
		ary << rand + dist
	end
	return ary
end

# a 2-row matrix with one bump per half of each row
def gen_rand_ary
	ary = [[],[]]
	ary.each do |p|
		p << ary_fill( @dist ) + ary_fill( -@dist )
		p.flatten!
	end
	return ary
end

# @num minus-one labels followed by @num plus-one labels
def gen_ones_vec
	ary = []
	@num.times { ary << -1 }
	@num.times { ary << 1 }
	return ary
end

puts "generating training data"
traindata_real = gen_rand_ary
testdata_real = gen_rand_ary

puts "generating labels"
trainlab = gen_ones_vec
testlab = gen_ones_vec
\endverbatim
In C#, this is done as follows:

\verbatim
int num = 100;      // number of examples per class; an example value
double dist = 0.5;  // distance between the two bumps; an example value

Random RandomNumber = new Random();

double[,] traindata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
	traindata_real[0, i] = RandomNumber.NextDouble() - dist;
	traindata_real[0, i + num] = RandomNumber.NextDouble() + dist;
	traindata_real[1, i] = RandomNumber.NextDouble() - dist;
	traindata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[,] testdata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
	testdata_real[0, i] = RandomNumber.NextDouble() - dist;
	testdata_real[0, i + num] = RandomNumber.NextDouble() + dist;
	testdata_real[1, i] = RandomNumber.NextDouble() - dist;
	testdata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[] trainlab = new double[num * 2];
for (int i = 0; i < num; i++) {
	trainlab[i] = -1;
	trainlab[i + num] = 1;
}

double[] testlab = new double[num * 2];
for (int i = 0; i < num; i++) {
	testlab[i] = -1;
	testlab[i + num] = 1;
}
\endverbatim
The rest of this tutorial below will now work the same (identical syntax) for
python, octave (when using a trailing semicolon for each command, which is
optional in python), lua (where we use a colon to call a method on an object),
ruby and C#.
For java, we use DoubleMatrix to represent most of the types. If a shogun C++
class needs an int vector/matrix, we convert the DoubleMatrix into an int
vector/matrix, and when it is returned from the function, we convert the int
vector/matrix back into a DoubleMatrix.

\section svm_tutorial_modular Creating an SVM classifier

To process the above toy data in shogun, we need to create a shogun feature
object, here RealFeatures (for dense real valued feature matrices, see also
shogun::CSimpleFeatures), like this:

\verbatim
feats_train = RealFeatures(traindata_real);
feats_test = RealFeatures(testdata_real);
\endverbatim
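Under python one can quickly verify that the data arrived intact (a sketch;
get_num_vectors() and get_num_features() are the accessors on shogun's dense
feature objects):

\verbatim
print(feats_train.get_num_vectors())   # 200
print(feats_train.get_num_features())  # 2
\endverbatim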
Using the above feature object we can now create a kernel object. Here, we
create a Gaussian kernel of a certain width (see also shogun::CGaussianKernel)
based on our training features:

\verbatim
width = 2.1; # an example kernel width
kernel = GaussianKernel(feats_train, feats_train, width);
\endverbatim
and can now compute the kernel matrix:

\verbatim
km = kernel.get_kernel_matrix();
\endverbatim
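Under python, km comes back as a native numpy array, so it can be inspected
directly; for our 200 training examples it is a symmetric 200 x 200 matrix
(a sketch assuming numpy):

\verbatim
from numpy import allclose
print(km.shape)            # (200, 200)
print(allclose(km, km.T))  # True: kernel matrices are symmetric
\endverbatim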
To train an SVM, we need labeled examples, i.e. a vector of ones and minus
ones, such as the one we have previously stored in the variable train_labels.
We now create a shogun label object from it (the same goes for our test labels,
which we'll use later):

\verbatim
labels = Labels(train_labels);
labels_test = Labels(test_labels);
\endverbatim
Given the labels object and the kernel, all that is left to do is to specify a
cost parameter C (used to control generalization performance) and we can
construct an SVM object. To start the training, we simply invoke the train
method of the SVM object. Quite easy, isn't it?

\verbatim
C = 1.0; # an example cost value
svm = LibSVM(C, kernel, labels);
svm.train();
\endverbatim
To apply the SVM to unseen test data, we simply need to pass a feature object
to the SVM's apply method, which returns a shogun Label object (note that we
could alternatively initialize the kernel object with the train and test data
manually and then call apply without arguments, which is done in some of the
other example scripts). If we would like to analyze the outputs directly in
python/octave, we can obtain the vector of outputs in native python/octave
representation via get_labels():

\verbatim
output = svm.apply(feats_test);
output_vector = output.get_labels();
\endverbatim
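For instance, under python one could already estimate the accuracy by hand from
this vector (a sketch assuming numpy and the test_labels array generated
earlier; the raw outputs are real valued, so we threshold them with sign()):

\verbatim
from numpy import mean, sign
print(mean(sign(output_vector) == test_labels))
\endverbatim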
Given the output and the test labels, we can now assess the prediction
performance. For this, we create an instance of the class PerformanceMeasures,
which provides a convenient way of obtaining various performance measures, such
as accuracy (acc), area under the receiver operating characteristic curve
(auROC), F-measure and others:

\verbatim
pm = PerformanceMeasures(labels_test, output);
acc = pm.get_accuracy();
roc = pm.get_auROC();
fms = pm.get_fmeasure();
\endverbatim
That's really it. For any of the advanced topics please have a look at one of
the \b many self-explanatory examples in
\li examples/octave-modular also available online \subpage octave_modular_examples "here"
\li examples/python-modular also available online \subpage python_modular_examples "here"
A full, working example similar to the one presented above is shown below:
In octave:
\verbinclude classifier_libsvm_minimal_modular.m

In java:
\verbinclude classifier_libsvm_minimal_modular.java

In lua:
\verbinclude classifier_libsvm_minimal_modular.lua

In ruby:
\verbinclude classifier_libsvm_minimal_modular.rb

In C#:
\verbinclude classifier_libsvm_minimal_modular.cs

In python:
\verbinclude classifier_libsvm_minimal_modular.py