/*!
\page modular_tutorial Tutorial for Modular Interfaces

SHOGUN's "modular" interfaces to Python, Octave, Java, Lua, Ruby and C# give
intuitive and easy access to shogun's functionality. Compared to the static
interfaces (\subpage staticinterfaces), the modular ones are much more flexible
and allow for nearly unlimited extensibility.

If this is your first time using shogun, you've found the right place to start!


In this tutorial, we demonstrate how to use shogun to create a simple Gaussian
kernel based Support Vector Machine (SVM) classifier. But first things first:
let's fire up python, octave, java, lua, ruby or C# and load the modular shogun
environment.

\section start_shogun_modular Starting SHOGUN

To load all of shogun's modules under octave, start octave and issue
\verbatim
init_shogun
\endverbatim

Under python, we have to specify what we'd like to import.
For this example, we will need features and labels to represent our input data,
a classifier, a kernel to compare the similarity of features and
some evaluation procedures.

\verbatim
from shogun.Features import *
from shogun.Kernel import *
from shogun.Classifier import *
from shogun.Evaluation import *
\endverbatim

Under java, we need to import the shogun package and the jblas package.

\verbatim
import org.shogun.*;
import org.jblas.*;
\endverbatim

Under lua, we need to set LUA_PATH and LUA_CPATH so that the *.lua and *.so
libraries that lua_modular requires can be found. For instance, if the current
directory is examples/undocumented/lua_modular, we can set the paths as follows
before running the examples:

\verbatim
export LUA_PATH=../../../src/interfaces/lua_modular/?.lua\;?.lua
export LUA_CPATH=../../../src/interfaces/lua_modular/?.so
\endverbatim
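
With the paths set, an example script can then be run directly from that
directory, for instance (assuming a lua interpreter is on your PATH):

\verbatim
lua classifier_libsvm_minimal_modular.lua
\endverbatim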

Under ruby, we need to load the shogun package and the narray package.

\verbatim
require 'modshogun'
require 'narray'
\endverbatim

Under C#, we need to import the System namespace.

\verbatim
using System;
\endverbatim

Under lua, we also need to load the shogun package, which includes all the
sub-packages, such as kernel, classifier, distance and so on.

\verbatim
require('shogun')
\endverbatim

Under java, we also have to specify which libraries we'd like to load.
For this example, we will need features and labels to represent our input data,
a classifier, and a kernel to compare the similarity of features:

\verbatim
System.loadLibrary("Features");
System.loadLibrary("Classifier");
System.loadLibrary("Kernel");
\endverbatim

Under java, we additionally need to use a method of Features to initialize
shogun, as follows:
\verbatim
Features.init_shogun_with_defaults();
\endverbatim

(Note here that the module shogun.Features contains the class Labels.)

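For instance, under python the Labels class is therefore imported from
shogun.Features; a minimal sketch (assuming numpy, with one label value per
example, exactly as the tutorial does further below):

\verbatim
from numpy import array
from shogun.Features import Labels
lab = Labels(array([-1.0, 1.0, 1.0]))
\endverbatim
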
\section docu_shogun_modular Getting Help

If you would like to get an overview of the classes in shogun, or
if you are looking for the documentation of a particular class,
use the "Classes" tab above and browse through the class list.
All classes are rather well documented (if the documentation is not good
enough for a particular class, please notify us). Note that all classes in the
class list (classes tab above) are prefixed with a 'C' - that
really is the only difference to using them from within the python or
octave modular interfaces (where the C prefix is missing).

Alternatively, under python the same documentation is available via
python help strings, so you may issue

\verbatim
help(<classname>)
\endverbatim

for example

\verbatim
from shogun.Kernel import GaussianKernel
help(GaussianKernel)
\endverbatim

or

\verbatim
import shogun.Kernel
help(shogun.Kernel)
\endverbatim
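
Similarly, python's built-in dir() lists everything a module or class exports,
which is handy for discovering available class and method names (plain python,
nothing shogun-specific):

\verbatim
import shogun.Kernel
dir(shogun.Kernel)
dir(shogun.Kernel.GaussianKernel)
\endverbatim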


\section toy_tutorial_modular Generating a toy dataset

To start with, we will generate a small toy dataset. In python, we need to
import numpy to do this. We will generate real valued training (and testing)
data drawn from Gaussian distributions. We will generate two Gaussian blobs
whose centers are shifted by -dist and +dist, respectively. The data is stored
as a matrix with each column describing one object, i.e. as many columns as we
have examples. Additionally, we need a vector of labels (that is, a vector of
ones and minus ones) indicating the class of each example.

\verbatim
from numpy import *
from numpy.random import randn
dist=0.5
traindata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
testdata_real = concatenate((randn(2,100)-dist, randn(2,100)+dist), axis=1)
train_labels = concatenate((-ones(100), ones(100)))
test_labels = concatenate((-ones(100), ones(100)))
\endverbatim

In octave, this is done as follows:

\verbatim
dist=0.5;
traindata_real = [randn(2,100)-dist, randn(2,100)+dist];
testdata_real = [randn(2,100)-dist, randn(2,100)+dist];
train_labels = [-ones(1,100), ones(1,100)];
test_labels = [-ones(1,100), ones(1,100)];
\endverbatim

In java, this is done as follows:
\verbatim
// assumes: import static org.jblas.DoubleMatrix.*;
int num = 1000;
double dist = 1.0;

DoubleMatrix offs = ones(2, num).mmul(dist);
DoubleMatrix x = randn(2, num).sub(offs);
DoubleMatrix y = randn(2, num).add(offs);
DoubleMatrix traindata_real = concatHorizontally(x, y);

DoubleMatrix m = randn(2, num).sub(offs);
DoubleMatrix n = randn(2, num).add(offs);
DoubleMatrix testdata_real = concatHorizontally(m, n);

DoubleMatrix o = ones(1, num);
DoubleMatrix trainlab = concatHorizontally(o.neg(), o);
DoubleMatrix testlab = concatHorizontally(o.neg(), o);
\endverbatim

In lua, this is done as follows:
\verbatim
require 'shogun'
require 'load'

function concatenate(...)
    local result = ...
    for _,t in ipairs{select(2, ...)} do
        for row,rowdata in ipairs(t) do
            for col,coldata in ipairs(rowdata) do
                table.insert(result[row], coldata)
            end
        end
    end
    return result
end

function rand_matrix(rows, cols, dist)
    local matrix = {}
    for i = 1, rows do
        matrix[i] = {}
        for j = 1, cols do
            matrix[i][j] = math.random() + dist
        end
    end
    return matrix
end

function ones(num)
    r = {}
    for i = 1, num do
        r[i] = 1
    end
    return r
end


num = 1000
dist = 1
width = 2.1
C = 1

traindata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))
testdata_real = concatenate(rand_matrix(2, num, -dist), rand_matrix(2, num, dist))
trainlab = {}
for i = 1, num do
    trainlab[i] = -1
    trainlab[i + num] = 1
end

testlab = {}
for i = 1, num do
    testlab[i] = -1
    testlab[i + num] = 1
end
\endverbatim

In ruby, this is done as follows (the helper ary_fill used below is missing
from the original snippet; a plausible definition, filling an array with @num
random values shifted by dist, is included here):
\verbatim
@num = 1000
@dist = 1
@width = 2.1
C = 1

# assumed helper: an array of @num random values, shifted by dist
def ary_fill dist
  ary = []
  @num.times do
    ary << rand + dist
  end
  return ary
end

def gen_rand_ary
  ary = [[],[]]
  ary.each do |p|
    p << ary_fill( @dist ) + ary_fill( -@dist )
    p.flatten!
  end
  return ary
end

def gen_ones_vec
  ary = []
  @num.times do
    ary << -1
  end
  @num.times do
    ary << 1
  end
  return ary
end

puts "generating training data"
traindata_real = gen_rand_ary
testdata_real = gen_rand_ary

puts "generating labels"
trainlab = gen_ones_vec
testlab = gen_ones_vec
\endverbatim

In C#, this is done as follows:

\verbatim
int num = 1000;
double dist = 1.0;
double width = 2.1;
double C = 1.0;

Random RandomNumber = new Random();

double[,] traindata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
    traindata_real[0, i] = RandomNumber.NextDouble() - dist;
    traindata_real[0, i + num] = RandomNumber.NextDouble() + dist;
    traindata_real[1, i] = RandomNumber.NextDouble() - dist;
    traindata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[,] testdata_real = new double[2, num * 2];
for (int i = 0; i < num; i++) {
    testdata_real[0, i] = RandomNumber.NextDouble() - dist;
    testdata_real[0, i + num] = RandomNumber.NextDouble() + dist;
    testdata_real[1, i] = RandomNumber.NextDouble() - dist;
    testdata_real[1, i + num] = RandomNumber.NextDouble() + dist;
}

double[] trainlab = new double[num * 2];
for (int i = 0; i < num; i++) {
    trainlab[i] = -1;
    trainlab[i + num] = 1;
}

double[] testlab = new double[num * 2];
for (int i = 0; i < num; i++) {
    testlab[i] = -1;
    testlab[i + num] = 1;
}
\endverbatim

The rest of this tutorial below will now work with identical syntax for
python, octave (when using a trailing semicolon for each command, which is
optional in python), lua (where we use a colon to call a method on an object),
ruby and C#.

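To illustrate the lua difference: where the python code below writes
svm.train(), lua uses the colon syntax for method calls. A sketch mirroring
the snippets below:

\verbatim
svm = LibSVM(C, kernel, labels)
svm:train()
output = svm:apply(feats_test)
\endverbatim
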
For java, we use jblas' DoubleMatrix to represent most of the types. If a
shogun C++ class needs an integer vector/matrix, the DoubleMatrix is converted
into an integer vector/matrix on the way in, and the returned integer
vector/matrix is converted back into a DoubleMatrix on the way out.

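A minimal java sketch of the steps below, using the data generated above (the
class names mirror the python ones; treat this as an outline rather than a
verbatim copy of the shipped example):

\verbatim
RealFeatures feats_train = new RealFeatures(traindata_real);
RealFeatures feats_test = new RealFeatures(testdata_real);
GaussianKernel kernel = new GaussianKernel(feats_train, feats_train, width);
Labels labels = new Labels(trainlab);
LibSVM svm = new LibSVM(C, kernel, labels);
svm.train();
DoubleMatrix out = svm.apply(feats_test).get_labels();
\endverbatim
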
\section svm_tutorial_modular Creating an SVM classifier


To process the above toy data in shogun, we need to create a shogun feature
object, here RealFeatures (for dense real valued feature matrices, see also
shogun::CSimpleFeatures), like this:

\verbatim
feats_train = RealFeatures(traindata_real);
feats_test = RealFeatures(testdata_real);
\endverbatim

Using the above feature object we can now create a kernel object. Here, we
create a Gaussian kernel of a certain width (see also shogun::CGaussianKernel)
based on our training features

\verbatim
width = 2;
kernel = GaussianKernel(feats_train, feats_train, width);
\endverbatim

and can now compute the kernel matrix
\verbatim
km = kernel.get_kernel_matrix();
\endverbatim

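Under python, km is a plain numpy matrix of pairwise kernel values, so we can
inspect it directly. For the 200 training examples from the python toy data
above it is 200x200, and its diagonal entries equal 1, since a Gaussian kernel
compares each point with itself (a quick sanity check):

\verbatim
print(km.shape)    # (200, 200)
print(km[0, 0])    # 1.0
\endverbatim
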
To train an SVM, we need labeled examples, i.e. a vector of ones and minus
ones, such as the one we previously stored in the variable train_labels. We
now create a shogun label object from it (the same goes for our test labels,
which we'll use later):

\verbatim
labels = Labels(train_labels);
labels_test = Labels(test_labels);
\endverbatim

Given the labels object and the kernel, all that is left to do is to specify a
cost parameter C (used to control generalization performance), and we can
construct an SVM object. To start the training, we simply invoke the train
method of the SVM object. Quite easy, isn't it?

\verbatim
C = 1.0;
svm = LibSVM(C, kernel, labels);
svm.train();
\endverbatim

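After training, the learned model can be inspected via the SVM object's
accessors, e.g. the bias, the support vector coefficients and their count (a
sketch; these getters are defined on shogun's SVM base class):

\verbatim
b = svm.get_bias();
alphas = svm.get_alphas();
num_sv = svm.get_num_support_vectors();
\endverbatim
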
To apply the SVM to unseen test data, we simply need to pass a feature object to
the SVM's apply method, which returns a shogun Labels object (note that we could
alternatively initialize the kernel object with the train and test data manually
and then call apply without arguments, which is done in some of the other
example scripts). If we would like to analyze the outputs directly in
python/octave, we can obtain the vector of outputs in native python/octave
representation via get_labels().

\verbatim
output = svm.apply(feats_test);
output_vector = output.get_labels();
\endverbatim

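For instance, under python we can already compute the accuracy by hand from
the raw outputs (sign() and mean() come from the numpy import above) before
turning to the built-in measures:

\verbatim
accuracy = mean(sign(output_vector) == test_labels)
\endverbatim
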
Given the output and the test labels, we can now assess the prediction
performance. For this, we create an instance of the class PerformanceMeasures,
which provides a convenient way of obtaining various performance measures, such
as accuracy (acc), area under the receiver operating characteristic curve
(auROC), F-measure and others:

\verbatim
pm = PerformanceMeasures(labels_test, output);
acc = pm.get_accuracy();
roc = pm.get_auROC();
fms = pm.get_fmeasure();
\endverbatim

That's really it. For any of the advanced topics, please have a look at one of
the \b many self-explanatory examples in

\li examples/octave-modular also available online \subpage octave_modular_examples "here"
\li examples/python-modular also available online \subpage python_modular_examples "here"

A full, working example similar to the one presented above is shown below.

For octave:

\verbinclude classifier_libsvm_minimal_modular.m

For java:

\verbinclude classifier_libsvm_minimal_modular.java

For lua:

\verbinclude classifier_libsvm_minimal_modular.lua

For ruby:

\verbinclude classifier_libsvm_minimal_modular.rb

For csharp:

\verbinclude classifier_libsvm_minimal_modular.cs

For python:
\verbinclude classifier_libsvm_minimal_modular.py
*/
