Strong linear classifier/regressor, and also a strong non-linear method when using a kernel (polynomial, RBF). Very popular in the early 2000s.
Benefits
- O(nd) classification for n support vectors and d features (see the sketch below)
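As a minimal illustration of the O(nd) cost: a kernel SVM decision function evaluates one O(d) kernel per support vector. The sketch below assumes an RBF kernel and plain NumPy arrays; `sv`, `dual_coef`, `intercept` are illustrative names, not from any particular library.

```python
import numpy as np

def rbf_kernel(x, sv, gamma):
    """RBF kernel between one sample x, shape (d,), and support vectors sv, shape (n, d)."""
    return np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))

def decision_function(x, sv, dual_coef, intercept, gamma):
    """Kernel SVM decision value: one O(d) kernel evaluation per
    support vector, n support vectors in total, so O(n*d) overall."""
    return dual_coef @ rbf_kernel(x, sv, gamma) + intercept
```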
Challenges
- Kernel function evaluation can be expensive
- Can sometimes output a large number of coefficients (support vectors)
- Training is slow on large datasets, typically O(n^3)
Variations
- Soft-margin, to reduce the effects of noise/outliers near the margin
- Sparse methods, reducing the number of support vectors
- Approximations, for instance using the Nyström method; see sklearn's Nystroem and RBFSampler (sketched after this list)
- Faster training procedures, such as Stochastic Gradient Descent (SGD, usually only for linear SVM) and Sequential Minimal Optimization (SMO)
- Transductive, for semi-supervised learning
- Support vector clustering
- One-class SVM, for anomaly detection
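A minimal sketch combining two of the variations above: sklearn's Nystroem transformer builds an explicit low-rank approximation of the RBF kernel feature map, and a linear SVM (hinge loss, trained with SGD) is fitted on top as a stand-in for a full kernel SVM. The dataset and hyperparameters are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Nystroem maps inputs into an explicit feature space approximating the
# RBF kernel; the linear SVM on top then avoids kernel evaluations at
# prediction time entirely.
clf = make_pipeline(
    Nystroem(kernel="rbf", gamma=0.1, n_components=100, random_state=0),
    SGDClassifier(loss="hinge", max_iter=1000, random_state=0),
)
clf.fit(X, y)
print(clf.score(X, y))
```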
References
- A basic soft-margin kernel SVM implementation in Python, including RBF, polynomial, and quadratic kernels.
- Effects of Reduced Precision on Floating-Point SVM Classification Accuracy. No dataset required a precision higher than 15 bits.
- A Hardware-friendly Support Vector Machine for Embedded Automotive Applications. Precision down to 12 bits was used without significant reduction in performance. (See the quantization sketch after this list.)
- Approximate RBF Kernel SVM and Its Applications in Pedestrian Classification. Presents an O(d*(d+3)/2) implementation of the nonlinear RBF-kernel SVM by employing a second-order polynomial approximation (sketched after this list). "Due to their dot-product form, linear kernel SVMs are able to be transformed into a compact form by exchanging summation in classification formula, leading to extremely low O(d) complexity in terms of both time and memory."
- approxsvm: Approximating nonlinear SVM models with RBF kernel. C++ implementation of a second-order Maclaurin series approximation of LIBSVM models using an RBF kernel.
- An approximation of the Gaussian RBF kernel for efficient classification with SVMs. Reports about an 18-fold speedup on average, mostly without losing classification accuracy.
- Classification Using Intersection Kernel Support Vector Machines is efficient. Intersection Kernel SVM (IKSVM). Achieves constant runtime and space requirements by precomputing auxiliary tables, using a piecewise linear approximation (see the sketch after this list).
- Locally Linear Support Vector Machines. LL-SVM. Uses manifold learning / local codings, trained with Stochastic Gradient Descent. Approaches the classification performance of RBF SVM at 1/10 to 1/100 of the execution time, but is still 10-50x slower than linear SVM.
- Support vector machines with piecewise linear feature mapping. 2013. PWL-SVM. Maps the feature space into M piecewise linear sections, where M is a hyperparameter; typical values are 10-50. Learns the anchor points and parameters of the linear sections, then passes the data to a regular SVM or Least Squares SVM (LS-SVM). Reaches performance similar to RBF SVM, with a more compact coefficient representation when the number of support vectors exceeds the number of features. Prediction only requires addition, multiplication, and maximum operations. 10-100x faster to train than other non-linear methods. Note: no prediction times are given.
- Feed-Forward Support Vector Machine Without Multipliers. Fixed-point arithmetic, using only shift and add operations. Maintains good classification performance with respect to the conventional Gaussian kernel. (See the shift-and-add sketch after this list.)
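A rough sketch in the spirit of the two reduced-precision references above (not their exact setups): quantize a trained linear SVM's weights to an n-bit fixed-point grid and measure how accuracy degrades. Dataset, scaling scheme, and bit widths are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
clf = LinearSVC(dual=False).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

def quantize(values, bits):
    """Round to the nearest step of a symmetric fixed-point grid with
    the given number of bits, scaled to the largest magnitude."""
    scale = np.max(np.abs(values)) / (2 ** (bits - 1) - 1)
    return np.round(values / scale) * scale

for bits in (16, 12, 8, 4):
    wq, bq = quantize(w, bits), quantize(np.array([b]), bits)[0]
    acc = np.mean((X @ wq + bq > 0) == y)
    print(f"{bits:2d}-bit weights: accuracy {acc:.3f}")
```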
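The quoted point in the pedestrian-classification paper about linear kernels is easy to see: because the kernel is a dot product, the summation over support vectors can be exchanged and folded into a single weight vector once, after which prediction is O(d). A small self-contained check with random stand-in values:

```python
import numpy as np

# Stand-in fitted quantities: n support vectors sv (n, d),
# dual coefficients dual_coef = alpha_i * y_i (n,), intercept b.
rng = np.random.default_rng(0)
sv = rng.normal(size=(200, 16))
dual_coef = rng.normal(size=200)
b = 0.5

# O(n*d) form: kernel sum over all support vectors.
x = rng.normal(size=16)
f_slow = dual_coef @ (sv @ x) + b

# Exchange the summation: w = sum_i dual_coef_i * sv_i, then f(x) = <w, x> + b.
w = dual_coef @ sv          # precomputed once, O(n*d)
f_fast = w @ x + b          # O(d) per prediction

assert np.isclose(f_slow, f_fast)
```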
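The second-order approximations in the three references above share one idea: factor the RBF kernel as exp(-g|x|^2) * exp(-g|z|^2) * exp(2g<x,z>) and replace the last factor with its second-order Maclaurin series. The per-support-vector sum then collapses into a scalar, a vector, and a symmetric d x d matrix, matching the O(d*(d+3)/2) storage mentioned above. A sketch with random stand-in values; the real methods differ in details such as normalization and error control.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, gamma = 200, 8, 0.01
sv = rng.normal(size=(n, d))     # stand-in support vectors
dual_coef = rng.normal(size=n)   # stand-in alpha_i * y_i
b = 0.1

# Exact RBF decision value for one input z, O(n*d).
z = rng.normal(size=d)
f_exact = dual_coef @ np.exp(-gamma * np.sum((sv - z) ** 2, axis=1)) + b

# Collapse the support-vector sum using exp(u) ~ 1 + u + u^2/2
# with u = 2*gamma*<x, z>: precompute a scalar, vector, and matrix once.
c = dual_coef * np.exp(-gamma * np.sum(sv ** 2, axis=1))
c0 = c.sum()                          # constant term
w = 2 * gamma * (c @ sv)              # linear term
M = 2 * gamma ** 2 * (sv.T * c) @ sv  # quadratic term (symmetric, d x d)

# O(d^2) prediction, independent of the number of support vectors.
f_approx = np.exp(-gamma * (z @ z)) * (c0 + w @ z + z @ M @ z) + b
print(f_exact, f_approx)  # should agree closely for small gamma
```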
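For the intersection kernel, the IKSVM table trick follows from exchanging the two sums: h(z) = b + sum_i a_i sum_d min(x_id, z_d) = b + sum_d h_d(z_d), where each h_d is piecewise linear in z_d and can be tabulated per dimension. The sketch below uses sorted support values with prefix sums and a binary search, O(log n) per dimension; the paper additionally uses uniformly spaced tables to make this constant time. All values are random stand-ins.

```python
import numpy as np

def build_tables(sv, dual_coef):
    """Per-dimension sorted support values and prefix sums, so that
    h_d(s) = sum_i a_i * min(sv[i, d], s) is evaluated in O(log n)."""
    tables = []
    for d in range(sv.shape[1]):
        order = np.argsort(sv[:, d])
        v = sv[order, d]
        a = dual_coef[order]
        below = np.concatenate(([0.0], np.cumsum(a * v)))        # sum_{j<=k} a_j v_j
        above = a.sum() - np.concatenate(([0.0], np.cumsum(a)))  # sum_{j>k} a_j
        tables.append((v, below, above))
    return tables

def decision(z, tables, intercept):
    """h(z) = b + sum_d h_d(z_d), each term found via binary search."""
    total = intercept
    for zd, (v, below, above) in zip(z, tables):
        k = np.searchsorted(v, zd, side="right")  # count of v_j <= zd
        total += below[k] + zd * above[k]         # min() collapses per side
    return total

# Sanity check against the direct O(n*d) intersection-kernel sum.
rng = np.random.default_rng(0)
sv = rng.random((50, 4))
a = rng.normal(size=50)
z = rng.random(4)
direct = a @ np.minimum(sv, z).sum(axis=1)
assert np.isclose(decision(z, build_tables(sv, a), 0.0), direct)
```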
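On the multiplier-free idea in the last reference: if weights are constrained to signed powers of two, every multiply in the decision function becomes an arithmetic shift. This generic sketch is not the paper's exact scheme, only an illustration of the principle; weights are assumed nonzero and inputs pre-scaled to fixed point.

```python
import numpy as np

def power_of_two_weights(w):
    """Quantize nonzero weights to the nearest signed power of two,
    so multiplying by a weight reduces to a single shift."""
    sign = np.sign(w).astype(int)
    exp = np.round(np.log2(np.abs(w))).astype(int)
    return sign, exp

def shift_add_dot(x_fixed, sign, exp):
    """Dot product using only shifts, adds, and subtracts."""
    acc = 0
    for xi, s, e in zip(x_fixed, sign, exp):
        term = int(xi) << e if e >= 0 else int(xi) >> -e  # shift replaces multiply
        acc += term if s >= 0 else -term                  # sign via add/subtract
    return acc

# Inputs scaled to fixed point with 8 fractional bits (assumption).
w = np.array([0.26, -1.1, 0.5, 2.3])
x_fixed = np.round(np.array([0.7, -0.2, 1.5, 0.3]) * 256).astype(int)
sign, exp = power_of_two_weights(w)
print(shift_add_dot(x_fixed, sign, exp) / 256)  # ~ dot(w, x) with quantized w
```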