Shogun's ML functionality is currently split into feature representations, feature preprocessors, kernels, kernel normalizers, distances, classifiers, clustering algorithms, distributions, performance evaluation measures, regression methods, and structured output learners. The following gives a brief overview of all the ML-related algorithms, classes, and methods implemented within Shogun.
Feature Representations
Shogun supports a wide range of feature representations. Among them are the so-called simple features (cf. CSimpleFeatures), which are standard 2-d matrices; strings (cf. CStringFeatures), which, in contrast to the usual meaning of the term, are simply lists of vectors of arbitrary length; and sparse features (cf. CSparseFeatures), which efficiently represent sparse matrices.
Each of these feature objects
- Simple Features (CSimpleFeatures)
- Strings (CStringFeatures)
- Sparse Features (CSparseFeatures)
supports any of the standard types from bool to floats:
Supported Types
- bool
- 8bit char
- 8bit Byte (unsigned char)
- 16bit Integer
- 16bit Word (unsigned)
- 32bit Integer
- 32bit Unsigned Integer
- 32bit Float
- 64bit Float
- 96bit Float (long double)
Many other feature types are available. Some of them are based on the three basic feature types above, such as CTOPFeatures (TOP kernel features from CHMM), CFKFeatures (Fisher kernel features from CHMM) and CRealFileFeatures (vectors fetched from a binary file). It should be noted that all feature objects are derived from CFeatures. More complex feature types are:
- CAttributeFeatures - Features of attribute value pairs.
- CCombinedDotFeatures - Features that allow stacking of dot features.
- CCombinedFeatures - Features that allow stacking of arbitrary features.
- CDotFeatures - Features that support a certain set of operations (like multiplication with a scalar and addition to a dense vector). Examples are sparse and dense features.
- CDummyFeatures - Features without content; Only number of vectors is known.
- CExplicitSpecFeatures - Implement spectrum kernel feature space explicitly.
- CImplicitWeightedSpecFeatures - DotFeatures that implicitly compute the weighted spectrum kernel feature space
- CFactorGraphFeatures - Maintains arrays of feature graphs
- CLatentFeatures - Features for latent learning
- CLBPPyrDotFeatures - Local Binary Patterns with Scale Pyramids as dot features
- CPolyFeatures - Implement DotFeatures for the polynomial kernel
- CRandomFourierDotFeatures - Implement random Fourier features as DotFeatures
- CRandomKitchenSinksDotFeatures - Implement Random Kitchen Sinks (RKS) as DotFeatures
- CSparsePolyFeatures - Implement DotFeatures for the polynomial kernel on sparse features
- CWDFeatures - DotFeatures that implicitly implement weighted degree kernel features
In addition, labels are represented in CLabels and the alphabet of a string in CAlphabet.
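To illustrate the difference between the dense and sparse representations above, the following sketch (plain numpy, not Shogun's API; the helper `to_sparse` is made up for this example) stores a 2-d feature matrix densely and as per-example lists of (index, value) pairs:

```python
import numpy as np

# Dense "simple" features: a 2-d matrix, one column per example.
dense = np.array([[1.0, 0.0, 2.0],
                  [0.0, 0.0, 3.0]])  # 2 dimensions x 3 examples

def to_sparse(matrix):
    """Return, per column/example, a list of (row_index, value) pairs
    for the nonzero entries only -- the idea behind sparse features."""
    cols = []
    for j in range(matrix.shape[1]):
        col = matrix[:, j]
        nz = np.nonzero(col)[0]
        cols.append([(int(i), float(col[i])) for i in nz])
    return cols

sparse = to_sparse(dense)
# only 3 (index, value) pairs are stored instead of 6 matrix entries
```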
Preprocessors
The aforementioned features can be preprocessed on the fly, e.g. to subtract the mean or to normalize vectors to norm 1. The following preprocessors are implemented:
- CDimensionReductionPreprocessor - Lowers dimensionality of dense matrices
- CFisherLDA - Performs linear discriminant analysis on input feature vectors/matrices
- CHomogeneousKernelMap - An additive kernel map
- CKernelPCA - Performs kernel principal component analysis
- CNormOne - Normalizes vectors to norm 1
- CLogPlusOne - Adds 1 and applies log()
- CPCACut - Keeps eigenvectors with the highest eigenvalues
- CPNorm - Normalizes vectors to have p-norm
- CPruneVarSubMean - Removes dimensions with little variance, subtracting the mean
- CRandomFourierGaussPreproc - Random Fourier Features for the Gauss kernel
- CRescaleFeatures - Rescale range of features to independent ranges
- CSortUlongString - Sorts the symbols within 64bit string vectors
- CSortWordString - Sorts the symbols within 16bit string vectors
- CSumOne - Normalizes vectors to have sum 1
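To illustrate what such preprocessors do, here is a minimal numpy sketch of the transformations behind CNormOne and CLogPlusOne (the function names `norm_one` and `log_plus_one` are made up for this example; this is not Shogun code):

```python
import numpy as np

def norm_one(X):
    """Scale each example (column) to unit Euclidean norm,
    as CNormOne does."""
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # leave all-zero vectors untouched
    return X / norms

def log_plus_one(X):
    """Apply log(x + 1) elementwise, as CLogPlusOne does."""
    return np.log(X + 1.0)

X = np.array([[3.0, 0.0],
              [4.0, 2.0]])
Xn = norm_one(X)
# every column of Xn now has Euclidean norm 1
```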
Classifiers
A multitude of classifiers is implemented in Shogun. Among them are several standard 2-class classifiers, 1-class classifiers, and multi-class classifiers. Several of them are linear classifiers and SVMs. Among the fastest linear SVM classifiers are CSVMSGD, CSVMOcas and CLibLinear, which are capable of dealing with millions of examples and features.
Linear Classifiers
- CPerceptron - Standard online perceptron
- CAveragedPerceptron - A simple extension of the standard perceptron.
- CFeatureBlockLogisticRegression - A linear binary logistic loss classifier
- CGaussianProcessClassification - A binary and multiclass classifier based on Gaussian Processes
- CLDA - Fisher's linear discriminant
- CLPM - Linear programming machine (1-norm regularized SVM)
- CLPBoost - Linear programming machine using boosting on the features
- CNearestCentroid - A nearest shrunken centroid classifier
- CPluginEstimate - A classifier that takes two CLinearHMM probabilistic models as input
Support Vector Machines
- CSVM - A generic Support Vector Machine (SVM)
- CLibLinear - A linear SVM with l2-regularized bias
- COnlineLibLinear - A purely online version of CLibLinear
- CSVMLin - A linear SVM with l2-regularized bias
- CSVMOcas - A linear SVM with l2-regularized bias
- CSVMLight - A variant of SVMlight using pr_loqo as its internal solver
- CLibSVM - LibSVM modified to use Shogun's kernel framework
- CSVMSGD - An SVM with stochastic gradient
- COnlineSVMSGD - A purely online version of SVM with stochastic gradient
- CMPDSVM - Minimal Primal Dual SVM
- CGPBTSVM - Gradient Projection Technique SVM
- CWDSVMOcas - CSVMOcas based SVM using explicitly spanned WD-Kernel feature space
- CGNPPSVM - SVM solver based on the generalized nearest point problem
- CNewtonSVM - linear primal SVM trained using Newton-like iterations
- CSGDQN - An SVM with Quasi-Newton stochastic gradient
- CGMNPSVM - A true multiclass one vs. rest SVM
- CMCSVM - An experimental multiclass SVM
- CLibSVMMultiClass - LibSVM's one-vs-one multiclass SVM solver
- CLibSVMOneClass - LibSVM's one-class SVM
Vowpal Wabbit
- CVowpalWabbit - An implementation of the online learning algorithm used in Vowpal Wabbit
Distance Machines
- k-Nearest Neighbor - Standard k-NN
Regression
- CLeastSquaresRegression - Least squares regression
- CGaussianProcessRegression - Regression based on Gaussian Processes
Support Vector Regression
- CSVRLight - SVMLight based SVR
- CLibSVR - LIBSVM based SVR
- CMKLRegression - Multiple Kernel Learning for regression
Others
- CLinearRidgeRegression - A regularized least squares method for classification and regression
- CKernelRidgeRegression - A regularized least squares method for classification and regression
- CLeastAngleRegression - Solves l1-regularized least squares regression
Distributions
- CHMM - Hidden Markov Models
- CEMMixtureModel - EM specialized for Mixture models
- CHistogram - Histogram
- CGaussian - Gaussian Distribution
- CKernelDensity - Kernel Density Estimation technique
- CLinearHMM - Markov chains (embedded in "Linear" HMMs)
- CPositionalPWM - Positional PWM
- CMixtureModel - Mixture of various simple distributions
Classical Distributions
- CGaussianDistribution - Dense version of well-known Gaussian Distribution
Clustering
- CHierarchical - Agglomerative hierarchical single linkage clustering.
- CKMeans - k-Means Clustering
Multiple Kernel Learning
- CMKL - A support vector machine based method for use with multiple kernels
- CMKLOneClass - Multiple Kernel Learning for one-class-classification
- CMKLClassification - Multiple Kernel Learning for two-class-classification
- CMKLMulticlass - L1-norm Multiple Kernel Learning for multiclass classification
Kernels
- CANOVAKernel - ANalysis Of VAriances (ANOVA) kernel
- CAUCKernel - To maximize AUC in SVM training (takes a kernel as input)
- CBesselKernel - Bessel Kernel
- CCauchyKernel - Cauchy Kernel
- CChi2Kernel - Chi^2 Kernel
- CCircularKernel - Circular kernel
- CCombinedKernel - Combined kernel to work with multiple kernels
- CCommUlongStringKernel - Spectrum Kernel with spectrums of up to 64bit
- CCommWordStringKernel - Spectrum kernel with spectrum of up to 16 bit
- CConstKernel - A "kernel" returning a constant
- CCustomKernel - A user supplied custom kernel
- CDiagKernel - A kernel with nonzero elements only on the diagonal
- CDistanceKernel - A transformation to transform distances into similarities
- CFixedDegreeStringKernel - A string kernel
- CExponentialKernel - A kernel closely related to the Gaussian Kernel
- CGaussianARDKernel - A Gaussian Kernel with Automatic Relevance Detection
- CGaussianKernel - The standard Gaussian kernel
- CGaussianMatchStringKernel - A variant of the Gaussian kernel on strings of same length
- CGaussianShiftKernel - Gaussian kernel with shift (inspired by the Weighted Degree shift kernel)
- CGaussianShortRealKernel - Gaussian Kernel on 32bit Floats
- CHistogramIntersectionKernel - A kernel that computes the histogram intersection distance between sets of histograms
- CHistogramWordStringKernel - A TOP kernel on Sequences
- CInverseMultiQuadricKernel - Inverse MultiQuadric kernel
- CJensenShannonKernel - The Jensen-Shannon kernel
- CLinearARDKernel - Linear Kernel with Automatic Relevance Detection
- CLinearKernel - Linear Kernel
- CLinearStringKernel - Linear Kernel on Strings
- CLocalAlignmentStringKernel - kernel that compares two sequences through all possible local alignments
- CLocalityImprovedStringKernel - A kernel inspired from polynomial kernel that emphasizes on local features
- CLogKernel - Log Kernel
- CMatchWordStringKernel - Another String kernel
- CMultiquadricKernel - MultiQuadric kernel
- COligoStringKernel - The oligo string kernel
- CPolyKernel - The standard polynomial kernel
- CPolyMatchStringKernel - Polynomial kernel on strings
- CPolyMatchWordStringKernel - Polynomial kernel on strings
- CPowerKernel - Power Kernel
- CProductKernel - Combines a number of kernels into a single ProductKernel
- CPyramidChi2 - Pyramid chi2 kernel (from image analysis)
- CRationalQuadraticKernel - Rational Quadratic kernel
- CRegulatoryModulesStringKernel - Regulatory modules string kernel
- CSalzbergWordStringKernel - String kernel based on Salzberg features
- CSigmoidKernel - Tanh sigmoidal kernel
- CSimpleLocalityImprovedStringKernel - A variant of the locality improved kernel
- CSNPStringKernel - A variant of the polynomial kernel on strings of same length
- CSparseSpatialSampleStringKernel - Sparse Spatial Sample String Kernel
- CSpectrumMismatchRBFKernel - Spectrum mismatch RBF kernel
- CSpectrumRBFKernel - Spectrum RBF kernel
- CSphericalKernel - Spherical kernel
- CSplineKernel - The Spline Kernel (function is the cubic polynomial)
- CSubsequenceStringKernel - Subsequence String Kernel (SSK)
- CTensorProductPairKernel - The Tensor Product Pair Kernel (TPPK)
- CTStudentKernel - Generalized T-Student Kernel
- CWaveKernel - Wave kernel
- CWaveletKernel - Wavelet kernel
- CWeightedCommWordStringKernel - A weighted (or blended) spectrum kernel
- CWeightedDegreePositionStringKernel - Weighted Degree kernel with shift
- CWeightedDegreeRBFKernel - Weighted Degree RBF kernel
- CWeightedDegreeStringKernel - Weighted Degree string kernel
Kernel Normalizers
Since several of the kernels pose numerical challenges to SVM optimizers, kernels can be "normalized", for example to have ones on the diagonal.
- CSqrtDiagKernelNormalizer - Divides the kernel by the square root of the product of the diagonal entries
- CAvgDiagKernelNormalizer - Divides by the average diagonal value
- CFirstElementKernelNormalizer - Divides by the first kernel element k(0,0)
- CIdentityKernelNormalizer - No normalization
- CDiceKernelNormalizer - Normalization inspired by the Dice coefficient
- CRidgeKernelNormalizer - Adds a ridge on the kernel diagonal
- CScatterKernelNormalizer - Scatter-normalized kernel
- CTanimotoKernelNormalizer - Normalizer inspired by the Tanimoto coefficient
- CVarianceKernelNormalizer - Normalizes vectors in feature space to norm 1
- CZeroMeanCenterKernelNormalizer - Centers the kernel in feature space
Distances
Distance measures compute the distance between objects. They can be used in CDistanceMachines such as CKNN. The following distances are implemented:
- CAttenuatedEuclideanDistance - Euclidean Distance (ignoring 0s)
- CBrayCurtisDistance - Bray-Curtis distance
- CCanberraMetric - Canberra metric
- CCanberraWordDistance - Canberra metric for words
- CChebyshewMetric - Chebyshev metric
- CChiSquareDistance - Chi^2 distance
- CCosineDistance - Cosine distance
- CCustomDistance - User provided custom distances
- CCustomMahalanobisDistance - Vector based calculation for Mahalanobis Distance
- CEuclidianDistance - Euclidean distance
- CGeodesicMetric - Geodesic metric
- CHammingWordDistance - Hamming Distance
- CJensenMetric - Jensen metric
- CMahalanobisDistance - Mahalanobis Distance
- CManhattanMetric - Manhattan metric
- CManhattanWordDistance - Manhattan distance for words
- CMinkowskiMetric - Minkowski metric
- CSparseEuclideanDistance - Sparse Euclidean Distance
- CTanimotoDistance - Tanimoto distance
Evaluation
Performance Measures
Performance measures assess the quality of a prediction and are implemented in CPerformanceMeasures. The following measures are implemented:
- Receiver Operating Characteristic Curve (ROC)
- Area under the ROC curve (auROC)
- Area over the ROC curve (aoROC)
- Precision Recall Curve (PRC)
- Area under the PRC (auPRC)
- Area over the PRC (aoPRC)
- Detection Error Tradeoff (DET)
- Area under the DET (auDET)
- Area over the DET (aoDET)
- Cross Correlation coefficient (CC)
- Weighted Relative Accuracy (WRAcc)
- Balanced Error (BAL)
- F-Measure
- Accuracy
- Error
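Two of these measures are easy to sketch directly: the F-measure as the harmonic mean of precision and recall, and the area under the ROC curve as the fraction of (positive, negative) pairs ranked correctly by the prediction scores. A numpy sketch on {+1, -1} labels (illustrative only, not the CPerformanceMeasures code):

```python
import numpy as np

def f_measure(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall)."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == -1))
    fn = np.sum((y_pred == -1) & (y_true == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auroc(y_true, scores):
    """Area under the ROC curve: fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = scores[y_true == 1]
    neg = scores[y_true == -1]
    correct = np.sum(pos[:, None] > neg[None, :])
    ties = np.sum(pos[:, None] == neg[None, :])
    return (correct + 0.5 * ties) / (len(pos) * len(neg))

y = np.array([1, 1, -1, -1])
scores = np.array([0.9, 0.4, 0.6, 0.1])
a = auroc(y, scores)                       # 3 of 4 pairs ranked correctly
f = f_measure(y, np.array([1, -1, 1, -1])) # precision = recall = 0.5
```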