/*! \page methods Methods


  Shogun's ML functionality is currently split into feature
  representations, feature preprocessors, kernels, kernel normalizers,
  distances, classifiers, clustering algorithms, distributions,
  performance evaluation measures, regression methods and structured
  output learners. The following gives a brief overview of all the
  ML-related algorithms, classes and methods implemented in Shogun.


  \section featrep_sec Feature Representations

  Shogun supports a wide range of feature representations. Among them are the
  so-called simple features (cf. CSimpleFeatures), which are standard 2-d
  matrices; string features (cf. CStringFeatures), which, unlike strings in
  the usual sense, are lists of vectors of arbitrary length; and sparse
  features (cf. CSparseFeatures), which represent sparse matrices efficiently.
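
  As a rough illustration of how the three basic representations differ, here
  is a minimal sketch in plain Python. It does not use the Shogun API; the
  names dense, strings, to_sparse and sparse_dot are hypothetical and exist
  only for this example.

```python
# Toy stand-ins for the three basic representations (plain Python, not Shogun).

# Dense ("simple") features: a 2-d matrix, every vector has the same length.
dense = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 3.0, 0.0],
]

# String features: a list of vectors whose lengths may differ.
strings = [
    [0, 1, 2, 3, 3, 1],  # e.g. a sequence encoded over a 4-letter alphabet
    [2, 2, 0],
]


def to_sparse(matrix):
    """Store each row as (index, value) pairs, keeping nonzero entries only."""
    return [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in matrix]


def sparse_dot(row, weights):
    """Dot product of one sparse row with a dense weight vector."""
    return sum(v * weights[j] for j, v in row)


sparse = to_sparse(dense)
w = [1.0, 1.0, 1.0, 1.0]
```

  The sparse form stores three entries instead of eight, and sparse_dot only
  touches the stored entries, which is where the efficiency gain for sparse
  matrices comes from.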


  Each of these feature objects


  \li Simple Features (CSimpleFeatures)

  \li Strings (CStringFeatures)

  \li Sparse Features (CSparseFeatures)


  supports any of the standard types from bool to floats:


  Supported Types

  \li bool

  \li 8bit char

  \li 8bit Byte

  \li 16bit Integer

  \li 16bit Word

  \li 32bit Integer

  \li 32bit Unsigned Integer

  \li 32bit Float matrix

  \li 64bit Float matrix

  \li 96bit Float matrix


  Many other feature types are available. Some of them are based on the three
  basic feature types above, like CTOPFeatures (TOP Kernel features from CHMM),
  CFKFeatures (Fisher Kernel features from CHMM) and CRealFileFeatures (vectors
  fetched from a binary file). Note that all feature objects are derived from
  CFeatures.

  More complex feature types:

  \li CAttributeFeatures - Features of attribute value pairs.

  \li CCombinedDotFeatures -  Features that allow stacking of dot features.

  \li CCombinedFeatures - Features that allow stacking of arbitrary features.

  \li CDotFeatures - Features that support a certain set of operations (like multiplication with a scalar and addition to a dense vector). Examples are sparse and dense features.

  \li CDummyFeatures - Features without content; Only number of vectors is known.

  \li CExplicitSpecFeatures - Implement spectrum kernel feature space explicitly.

  \li CImplicitWeightedSpecFeatures - DotFeatures that implicitly implement weighted spectrum kernel features.

  \li CFactorGraphFeatures - Maintains an array of factor graphs

  \li CLatentFeatures - Features for latent learning

  \li CLBPPyrDotFeatures - Local Binary Patterns with Scale Pyramids as dot features

  \li CPolyFeatures - Implement DotFeatures for the polynomial kernel

  \li CRandomFourierDotFeatures - Implement random Fourier features as DotFeatures

  \li CRandomKitchenSinksDotFeatures - Implement the Random Kitchen Sinks (RKS) for the DotFeatures

  \li CSparsePolyFeatures - Implement DotFeatures for the polynomial kernel

  \li CWDFeatures - DotFeatures that implicitly implement weighted degree kernel features


  In addition, labels are represented in CLabels and the alphabet of a string in

  CAlphabet.


  \section preproc_sec Preprocessors

  The aforementioned features can be preprocessed on the fly, e.g. to subtract the mean or to normalize vectors to norm 1. The following preprocessors are implemented:

  \li CDimensionReductionPreprocessor - Lowers dimensionality of dense matrices

  \li CFisherLDA - Performs linear discriminant analysis on input feature vectors/matrices

  \li CHomogeneousKernelMap - An additive kernel map

  \li CKernelPCA - Performs kernel principal component analysis

  \li CNormOne - Normalizes vectors to norm 1

  \li CLogPlusOne - Adds 1 and applies log()

  \li CPCACut - Keeps eigenvectors with the highest eigenvalues

  \li CPNorm - Normalizes vectors to have p-norm

  \li CPruneVarSubMean - Removes dimensions with little variance, subtracting the mean

  \li CRandomFourierGaussPreproc - Random Fourier Features for the Gauss kernel

  \li CRescaleFeatures - Rescale range of features to independent ranges

  \li CSortUlongString - Sorts vectors

  \li CSortWordString - Sorts vectors

  \li CSumOne - Normalizes vectors to have sum 1
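
  Several of the preprocessors above are simple per-vector transformations.
  The following plain Python sketch shows what such transformations compute.
  It is not the Shogun API: the function names are hypothetical, and details
  such as the handling of zero vectors are assumptions made for the example.

```python
import math


def norm_one(vec):
    """Scale a vector to Euclidean norm 1 (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def log_plus_one(vec):
    """Add 1 to every entry, then take the natural logarithm."""
    return [math.log(x + 1.0) for x in vec]


def sum_one(vec):
    """Scale a vector so its entries sum to 1 (zero-sum vectors are unchanged)."""
    total = sum(vec)
    return list(vec) if total == 0.0 else [x / total for x in vec]


def subtract_mean(vectors):
    """Subtract the per-dimension mean computed over all vectors."""
    n = len(vectors)
    means = [sum(col) / n for col in zip(*vectors)]
    return [[x - m for x, m in zip(vec, means)] for vec in vectors]
```

  For example, norm_one maps the vector (3, 4) to (0.6, 0.8), and
  subtract_mean leaves every dimension with mean zero.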


  \section classifiers_sec Classifiers


  A multitude of classifiers are implemented in Shogun. Among them are several
  standard 2-class classifiers, 1-class classifiers and multi-class
  classifiers. Several of them are linear classifiers and SVMs. Among the
  fastest linear SVM classifiers are CSVMSGD, CSVMOcas and CLibLinear (capable
  of dealing with millions of examples and features).


  \subsection linclassi_sec Linear Classifiers

  \li CPerceptron - Standard online perceptron

  \li CAveragedPerceptron - A simple extension of the standard perceptron.

  \li CFeatureBlockLogisticRegression - A linear binary logistic loss classifier

  \li CGaussianProcessClassification - A binary and multiclass classifier based on Gaussian Processes

  \li CLDA - Fisher's linear discriminant

  \li CLPM - Linear programming machine (1-norm regularized SVM)

  \li CLPBoost - Linear programming machine using boosting on the features

  \li CNearestCentroid - A nearest shrunk centroid classifier

  \li CPluginEstimate - A classifier that takes two CLinearHMM probabilistic models as input
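
  To make the first entry of this list concrete, here is a minimal sketch of
  the standard online perceptron in plain Python. It is a textbook version
  under stated assumptions (labels are +1 or -1, a bias term is learned, the
  names train_perceptron and predict are hypothetical), not Shogun's
  implementation.

```python
def train_perceptron(samples, labels, learn_rate=1.0, max_iter=100):
    """Online perceptron: update the weights on every misclassified sample."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_iter):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0.0:  # wrong side of (or on) the hyperplane
                w = [wi + learn_rate * y * xi for wi, xi in zip(w, x)]
                b += learn_rate * y
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: converged
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0.0 else -1


X = [[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
```

  On linearly separable data such as this toy set the loop stops as soon as
  one full pass makes no mistakes; max_iter only guards against data that is
  not separable.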


  \subsubsection svmclassi_sec Support Vector Machines

  \li CSVM - A generic Support Vector Machine (SVM)

  \li CLibLinear - A linear SVM with l2-regularized bias

  \li COnlineLibLinear - A purely online version of CLibLinear

  \li CSVMLin - A linear SVM with l2-regularized bias

  \li CSVMOcas - A linear SVM with l2-regularized bias

  \li CSVMLight - A variant of SVMlight using pr_loqo as its internal solver

  \li CLibSVM - LibSVM modified to use Shogun's kernel framework

  \li CSVMSGD - An SVM with stochastic gradient

  \li COnlineSVMSGD - A purely online version of SVM with stochastic gradient

  \li CMPDSVM - Minimal Primal Dual SVM

  \li CGPBTSVM - Gradient Projection Technique SVM

  \li CWDSVMOcas - CSVMOcas based SVM using explicitly spanned WD-Kernel feature space

  \li CGNPPSVM - SVM solver based on the generalized nearest point problem

  \li CNewtonSVM - linear primal SVM trained using Newton-like iterations

  \li CSGDQN - An SVM with Quasi-Newton stochastic gradient

  \li CGMNPSVM - A true multiclass one vs. rest SVM

  \li CMCSVM - An experimental multiclass SVM

  \li CLibSVMMultiClass - LibSVM's one vs. one multiclass SVM solver

  \li CLibSVMOneClass - LibSVM's one-class SVM


  \subsubsection vwclassi_sec Vowpal Wabbit

  \li CVowpalWabbit - An implementation of online learning algorithm used in Vowpal Wabbit


  \subsection distmachine_sec Distance Machines

  \li CKNN - Standard k-nearest neighbor (k-NN) classifier


  \section regression_sec Regression

  \li CLeastSquaresRegression - Least squares regression

  \li CGaussianProcessRegression - Regression based on Gaussian Processes


  \subsection svr_sec Support Vector Regression

  \li CSVRLight - SVMLight based SVR

  \li CLibSVR - LIBSVM based SVR

  \li CMKLRegression - Multiple Kernel Learning for regression


  \subsection other_regress Others

  \li CLinearRidgeRegression - A regularized least squares method for classification and regression

  \li CKernelRidgeRegression - A regularized least squares method for classification and regression

  \li CLeastAngleRegression - Solves L1-regularized least squares regression
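
  The effect of the regularizer in the ridge methods can be seen in the
  simplest possible case. The sketch below is plain Python, not the Shogun
  API, and assumes a single feature and no bias term; the name ridge_fit_1d
  and the parameter name tau are hypothetical.

```python
def ridge_fit_1d(xs, ys, tau):
    """Minimise sum((y - w*x)**2) + tau * w**2 for one feature, no bias.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + tau).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + tau)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w_plain = ridge_fit_1d(xs, ys, tau=0.0)   # ordinary least squares
w_ridge = ridge_fit_1d(xs, ys, tau=14.0)  # regularisation shrinks the weight
```

  With tau set to zero this reduces to ordinary least squares; a larger tau
  pulls the weight towards zero.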


  \section distrib_sec Distributions

  \li CHMM - Hidden Markov Models

  \li CEMMixtureModel - EM specialized for Mixture models

  \li CHistogram - Histogram

  \li CGaussian - Gaussian Distribution

  \li CKernelDensity - Kernel Density Estimation technique

  \li CLinearHMM - Markov chains (embedded in ``Linear'' HMMs)

  \li CPositionalPWM - Positional PWM

  \li CMixtureModel - Mixture of various simple distributions


  \subsection distrib_cla Classical Distributions

  \li CGaussianDistribution - Dense version of well-known Gaussian Distribution


  \section cluster_sec Clustering

  \li CHierarchical - Agglomerative hierarchical single linkage clustering.

  \li CKMeans - k-Means Clustering


  \section mkl_sec Multiple Kernel Learning

  \li CMKL - A support vector machine based method for use with multiple kernels

  \li CMKLOneClass - Multiple Kernel Learning for one-class-classification

  \li CMKLClassification - Multiple Kernel Learning for two-class-classification

  \li CMKLMulticlass - L1-norm Multiple Kernel Learning for multiclass classification


  \section kernels_sec Kernels

  \li CANOVAKernel - ANalysis Of VAriances (ANOVA) kernel

  \li CAUCKernel - To maximize AUC in SVM training (takes a kernel as input)

  \li CBesselKernel - Bessel Kernel

  \li CCauchyKernel - Cauchy Kernel

  \li CChi2Kernel - Chi^2 Kernel

  \li CCircularKernel - Circular kernel

  \li CCombinedKernel - Combined kernel to work with multiple kernels

  \li CCommUlongStringKernel - Spectrum Kernel with spectrums of up to 64bit

  \li CCommWordStringKernel - Spectrum kernel with spectrum of up to 16 bit

  \li CConstKernel - A ``kernel'' returning a constant

  \li CCustomKernel - A user supplied custom kernel

  \li CDiagKernel - A kernel with nonzero elements only on the diagonal

  \li CDistanceKernel - A transformation to transform distances into similarities

  \li CFixedDegreeStringKernel - A string kernel

  \li CExponentialKernel - A kernel closely related to the Gaussian Kernel

  \li CGaussianARDKernel - A Gaussian Kernel with Automatic Relevance Determination

  \li CGaussianKernel - The standard Gaussian kernel

  \li CGaussianMatchStringKernel - A variant of the Gaussian kernel on strings of same length

  \li CGaussianShiftKernel - Gaussian kernel with shift (inspired by the Weighted Degree shift kernel)

  \li CGaussianShortRealKernel - Gaussian Kernel on 32bit Floats

  \li CHistogramIntersectionKernel - A kernel that computes the histogram intersection distance between sets of histograms

  \li CHistogramWordStringKernel - A TOP kernel on Sequences

  \li CInverseMultiQuadricKernel - Inverse MultiQuadric kernel

  \li CJensenShannonKernel - The Jensen-Shannon kernel

  \li CLinearARDKernel - Linear Kernel with Automatic Relevance Determination

  \li CLinearKernel - Linear Kernel

  \li CLinearStringKernel - Linear Kernel on Strings

  \li CLocalAlignmentStringKernel - A kernel that compares two sequences through all possible local alignments

  \li CLocalityImprovedStringKernel - A kernel inspired by the polynomial kernel that emphasizes local features

  \li CLogKernel - Log Kernel

  \li CMatchWordStringKernel - Another String kernel

  \li CMultiquadricKernel - MultiQuadric kernel

  \li COligoStringKernel - The oligo string kernel

  \li CPolyKernel - The standard polynomial kernel

  \li CPolyMatchStringKernel - Polynomial kernel on strings

  \li CPolyMatchWordStringKernel - Polynomial kernel on strings

  \li CPowerKernel - Power Kernel

  \li CProductKernel - Combines a number of kernels into a single ProductKernel

  \li CPyramidChi2 - Pyramid chi2 kernel (from image analysis)

  \li CRationalQuadraticKernel - Rational Quadratic kernel

  \li CRegulatoryModulesStringKernel - Regulatory modules string kernel

  \li CSalzbergWordStringKernel - Salzberg features based string kernel

  \li CSigmoidKernel - Tanh sigmoidal kernel

  \li CSimpleLocalityImprovedStringKernel - A variant of the locality improved kernel

  \li CSNPStringKernel - A variant of the polynomial kernel on strings of same length

  \li CSparseSpatialSampleStringKernel - Sparse Spatial Sample String Kernel

  \li CSpectrumMismatchRBFKernel - Spectrum mismatch RBF kernel

  \li CSpectrumRBFKernel - Spectrum RBF kernel

  \li CSphericalKernel - Spherical kernel

  \li CSplineKernel - The Spline Kernel (function is the cubic polynomial)

  \li CSubsequenceStringKernel - Subsequence String Kernel (SSK)

  \li CTensorProductPairKernel - The Tensor Product Pair Kernel (TPPK)

  \li CTStudentKernel - Generalized T-Student Kernel

  \li CWaveKernel - Wave kernel

  \li CWaveletKernel - Wavelet kernel

  \li CWeightedCommWordStringKernel - A weighted (or blended) spectrum kernel

  \li CWeightedDegreePositionStringKernel - Weighted Degree kernel with shift

  \li CWeightedDegreeRBFKernel - Weighted Degree RBF kernel

  \li CWeightedDegreeStringKernel - Weighted Degree string kernel
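
  For readers unfamiliar with kernels, the following plain Python sketch
  evaluates three of the standard kernels from the list on dense vectors. It
  is not the Shogun API: the function names are hypothetical, and the
  parameterization of the Gaussian kernel (squared distance divided by a
  width) is an assumption made for the example.

```python
import math


def linear_kernel(x, y):
    """k(x, y) = x . y"""
    return sum(a * b for a, b in zip(x, y))


def poly_kernel(x, y, degree=2, c=1.0):
    """k(x, y) = (x . y + c) ** degree"""
    return (linear_kernel(x, y) + c) ** degree


def gaussian_kernel(x, y, width=2.0):
    """k(x, y) = exp(-||x - y||^2 / width); the width convention is assumed."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / width)


def kernel_matrix(kernel, data):
    """Evaluate the kernel on all pairs of examples."""
    return [[kernel(a, b) for b in data] for a in data]


data = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
K = kernel_matrix(gaussian_kernel, data)
```

  The resulting matrix K is symmetric, and for the Gaussian kernel its
  diagonal is already 1 because every vector has distance zero to itself.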


  \subsection kernel_normalizer Kernel Normalizers

  Since several of the kernels pose numerical challenges to SVM optimizers,

  kernels can be ``normalized'' for example to have ones on the diagonal.


  \li CSqrtDiagKernelNormalizer - divide kernel by square root of product of diagonal

  \li CAvgDiagKernelNormalizer - divide by average diagonal value

  \li CFirstElementKernelNormalizer - divide by first kernel element k(0,0)

  \li CIdentityKernelNormalizer - no normalization

  \li CDiceKernelNormalizer - normalization inspired by the Dice coefficient

  \li CRidgeKernelNormalizer - adds a ridge on the kernel diagonal

  \li CScatterKernelNormalizer - scatter-normalized kernel

  \li CTanimotoKernelNormalizer - Tanimoto coefficient inspired normalizer

  \li CVarianceKernelNormalizer - normalize vectors in feature space to norm 1

  \li CZeroMeanCenterKernelNormalizer - centers the kernel in feature space
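
  Two of the normalizers above are easy to write down on a precomputed kernel
  matrix. The sketch below is plain Python under that assumption, not the
  Shogun API, and the function names are hypothetical.

```python
import math


def sqrt_diag_normalize(K):
    """k'(i, j) = k(i, j) / sqrt(k(i, i) * k(j, j)); gives ones on the diagonal."""
    n = len(K)
    return [[K[i][j] / math.sqrt(K[i][i] * K[j][j]) for j in range(n)]
            for i in range(n)]


def avg_diag_normalize(K):
    """Divide every entry by the mean of the diagonal."""
    n = len(K)
    avg = sum(K[i][i] for i in range(n)) / n
    return [[v / avg for v in row] for row in K]


# Linear kernel matrix of the vectors (1, 0), (0, 2) and (3, 4).
K = [[1.0, 0.0, 3.0],
     [0.0, 4.0, 8.0],
     [3.0, 8.0, 25.0]]
K_sqrt = sqrt_diag_normalize(K)
K_avg = avg_diag_normalize(K)
```

  For a linear kernel the square-root-of-diagonal normalization yields the
  cosine of the angle between the vectors, so vectors of very different
  length no longer dominate the matrix.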


  \section dist_sec Distances

  Distance measures quantify the distance between objects. They can be used
  in CDistanceMachines like CKNN. The following distances are implemented:


  \li CAttenuatedEuclideanDistance - Euclidean Distance (ignoring 0s)

  \li CBrayCurtisDistance - Bray-Curtis distance

  \li CCanberraMetric - Canberra metric

  \li CCanberraWordDistance - Canberra metric for words

  \li CChebyshewMetric - Chebyshev metric

  \li CChiSquareDistance - Chi^2 distance

  \li CCosineDistance - Cosine distance

  \li CCustomDistance - User provided custom distances

  \li CCustomMahalanobisDistance - Vector based calculation for Mahalanobis Distance

  \li CEuclidianDistance - Euclidean distance

  \li CGeodesicMetric - Geodesic metric

  \li CHammingWordDistance - Hamming Distance

  \li CJensenMetric - Jensen metric

  \li CMahalanobisDistance - Mahalanobis Distance

  \li CManhattanMetric - Manhattan metric

  \li CManhattanWordDistance - Manhattan distance for words

  \li CMinkowskiMetric - Minkowski metric

  \li CSparseEuclideanDistance - Sparse Euclidean Distance

  \li CTanimotoDistance - Tanimoto distance
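
  The following plain Python sketch shows a few of the distances listed above
  and how a distance plugs into a k-nearest-neighbor classifier. It is not
  the Shogun API: all function names are hypothetical, and the majority vote
  is the textbook k-NN rule.

```python
import math
from collections import Counter


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    return max(abs(a - b) for a, b in zip(x, y))


def minkowski(x, y, p):
    """Generalises manhattan (p = 1) and euclidean (p = 2)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def knn_predict(train, labels, query, k, distance=euclidean):
    """Majority vote among the k training examples closest to the query."""
    ranked = sorted(range(len(train)), key=lambda i: distance(train[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


train = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
labels = [0, 0, 1, 1]
```

  Because the distance is passed in as a parameter, the same k-NN code works
  with any of the measures, e.g. knn_predict(train, labels, query, 3,
  distance=manhattan).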


  \section eval_sec Evaluation

  \subsection perf_sec Performance Measures

  Performance measures assess the quality of a prediction and are implemented
  in CPerformanceMeasures. The following measures are implemented:

  \li  Receiver Operating Characteristic curve (ROC)

  \li  Area under the ROC curve (auROC)

  \li  Area over the ROC curve (aoROC)

  \li  Precision Recall Curve (PRC)

  \li  Area under the PRC (auPRC)

  \li  Area over the PRC (aoPRC)

  \li  Detection Error Tradeoff (DET)

  \li  Area under the DET (auDET)

  \li  Area over the DET (aoDET)

  \li  Cross Correlation coefficient (CC)

  \li  Weighted Relative Accuracy (WRAcc)

  \li  Balanced Error (BAL)

  \li  F-Measure

  \li  Accuracy

  \li  Error
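
  A few of these measures can be computed directly from labels and classifier
  outputs. The sketch below is plain Python, not the CPerformanceMeasures
  API. The function names are hypothetical, labels are assumed to be +1 or
  -1, and the area under the ROC curve is computed via its pairwise-ranking
  interpretation rather than by integrating the curve.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(labels, predictions)) / len(labels)


def f_measure(labels, predictions, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(labels, predictions))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return 0.0 if tp == 0 else 2.0 * tp / (2.0 * tp + fp + fn)


def au_roc(labels, scores, positive=1):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(labels, scores) if t == positive]
    neg = [s for t, s in zip(labels, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, -1, -1]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s > 0.5 else -1 for s in scores]
```

  The error is one minus the accuracy, and the area over the ROC curve is one
  minus the area under it, so both follow from the functions above.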


*/