JSAT vs LIBLINEAR & LIBSVM

EdwardRaff edited this page Nov 1, 2015 · 1 revision

JSAT vs LIBSVM

LIBSVM is the canonical SVM library and has been ported to almost every language. Both JSAT and LIBSVM provide an SMO-based algorithm for obtaining the exact SVM solution. LIBSVM's will be faster, however, because it uses 32-bit floats instead of 64-bit doubles and because the more advanced working set algorithms in LIBSVM have not been implemented in JSAT.

For a parameter search, JSAT may be faster: it does not require reloading the data, as LIBSVM's Python script does, and JSAT offers the more efficient RandomSearch in addition to GridSearch.
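To illustrate the difference between the two search strategies, here is a minimal sketch in plain Java. It is not JSAT's RandomSearch or GridSearch API: the class name, method names, and the toy scoring function (a stand-in for cross-validated accuracy on data already held in memory) are all assumptions made for illustration.

```java
import java.util.Random;
import java.util.function.DoubleBinaryOperator;

/** Hypothetical sketch; these names are NOT part of JSAT. */
final class ParamSearchSketch {

    /** Best score over a log-spaced steps x steps grid (steps^2 evaluations). */
    static double gridSearch(DoubleBinaryOperator score, double lo, double hi, int steps) {
        double best = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < steps; i++) {
            for (int j = 0; j < steps; j++) {
                double c = logSpaced(lo, hi, i, steps);
                double gamma = logSpaced(lo, hi, j, steps);
                best = Math.max(best, score.applyAsDouble(c, gamma));
            }
        }
        return best;
    }

    /** Best score over a fixed budget of log-uniform random draws. */
    static double randomSearch(DoubleBinaryOperator score, double lo, double hi,
                               int trials, long seed) {
        Random rng = new Random(seed);
        double best = Double.NEGATIVE_INFINITY;
        for (int t = 0; t < trials; t++) {
            double c = logUniform(rng, lo, hi);
            double gamma = logUniform(rng, lo, hi);
            best = Math.max(best, score.applyAsDouble(c, gamma));
        }
        return best;
    }

    /** The i-th of `steps` points spaced evenly in log scale between lo and hi. */
    static double logSpaced(double lo, double hi, int i, int steps) {
        double t = steps == 1 ? 0.0 : (double) i / (steps - 1);
        return Math.exp(Math.log(lo) + t * (Math.log(hi) - Math.log(lo)));
    }

    /** One draw that is uniform in log scale between lo and hi. */
    static double logUniform(Random rng, double lo, double hi) {
        return Math.exp(Math.log(lo) + rng.nextDouble() * (Math.log(hi) - Math.log(lo)));
    }

    public static void main(String[] args) {
        // Toy stand-in for cross-validated accuracy, peaked at C = 10, gamma = 0.1.
        DoubleBinaryOperator toy = (c, g) ->
                -Math.pow(Math.log10(c) - 1, 2) - Math.pow(Math.log10(g) + 1, 2);
        System.out.println("grid (49 evaluations):   " + gridSearch(toy, 1e-3, 1e3, 7));
        System.out.println("random (20 evaluations): " + randomSearch(toy, 1e-3, 1e3, 20, 42L));
    }
}
```

The grid's cost grows with the number of steps raised to the number of parameters, while the random budget is whatever the caller chooses. In both cases the scoring function is called repeatedly within one process, so the data it closes over is loaded only once.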

Adding a new kernel for use in JSAT does not require modifying JSAT's source code.

LIBSVM supports more SVM types than JSAT (nu and one-class) as well as a direct multi-class SVM, whereas JSAT requires the use of a meta-classifier (such as OneVsOne) for multi-class exact SVM.
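For readers unfamiliar with the one-vs-one scheme, the following plain-Java sketch shows the general voting idea: one binary decision per pair of classes, with the most-voted class winning. It does not use JSAT's OneVsOne class; the `PairwiseModel` interface and the nearest-center toy model are hypothetical stand-ins for trained binary SVMs.

```java
/** Hypothetical sketch of one-vs-one voting; NOT JSAT's OneVsOne implementation. */
final class OneVsOneSketch {

    /** Stand-in for one trained binary classifier per pair of classes. */
    interface PairwiseModel {
        /** Returns a or b, whichever class the pairwise classifier prefers for x. */
        int winner(int a, int b, double[] x);
    }

    /** Runs all numClasses * (numClasses - 1) / 2 pairwise decisions and returns the top vote. */
    static int predict(PairwiseModel model, int numClasses, double[] x) {
        int[] votes = new int[numClasses];
        for (int a = 0; a < numClasses; a++) {
            for (int b = a + 1; b < numClasses; b++) {
                votes[model.winner(a, b, x)]++;
            }
        }
        int best = 0;
        for (int k = 1; k < numClasses; k++) {
            if (votes[k] > votes[best]) {
                best = k; // ties are resolved in favor of the lowest class index
            }
        }
        return best;
    }

    /** Toy pairwise model: prefer the class whose 1-D center is closer to x[0]. */
    static PairwiseModel nearestCenter(double[] centers) {
        return (a, b, x) ->
                Math.abs(x[0] - centers[a]) <= Math.abs(x[0] - centers[b]) ? a : b;
    }

    public static void main(String[] args) {
        PairwiseModel model = nearestCenter(new double[] {0.0, 5.0, 10.0});
        System.out.println(predict(model, 3, new double[] {4.2})); // prints 1
    }
}
```

The number of binary classifiers grows quadratically with the number of classes, which is part of the cost of wrapping a binary exact SVM this way.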

Unlike LIBSVM, JSAT supports many non-exact approximation algorithms for SVMs. These can be as accurate, or nearly as accurate, as the exact SVM while being orders of magnitude faster. They are especially useful for a parameter search, and can save weeks of time compared to using LIBSVM for the grid search.

JSAT vs LIBLINEAR

LIBLINEAR implements many efficient exact algorithms for linear problems. JSAT implements many of the same algorithms used in LIBLINEAR, such as DCDs (SVM classification / regression), LogisticRegressionDCD (L2 regularized Logistic Regression) and NewGLMNET (Elastic Net regularized Logistic Regression). The NewGLMNET in JSAT has been extended to support Elastic Net regularization, whereas the implementation in LIBLINEAR only performs L1 regularization (a special case of Elastic Net).
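To make the "special case" relationship concrete, here is a small sketch of an Elastic Net penalty in plain Java. The parameterization used, `lambda * (alpha * ||w||_1 + (1 - alpha) * 0.5 * ||w||_2^2)`, is one common convention and is an assumption here; it is not taken from JSAT's or LIBLINEAR's source, and the class and method names are hypothetical.

```java
/** Hypothetical sketch of an Elastic Net penalty; NOT code from JSAT or LIBLINEAR. */
final class ElasticNetSketch {

    /**
     * Computes lambda * (alpha * ||w||_1 + (1 - alpha) * 0.5 * ||w||_2^2).
     * With alpha = 1 this is a pure L1 penalty; with alpha = 0 it is a pure L2 penalty.
     */
    static double penalty(double[] w, double lambda, double alpha) {
        double l1 = 0.0;
        double l2Squared = 0.0;
        for (double wi : w) {
            l1 += Math.abs(wi);
            l2Squared += wi * wi;
        }
        return lambda * (alpha * l1 + (1.0 - alpha) * 0.5 * l2Squared);
    }

    public static void main(String[] args) {
        double[] w = {1.0, -2.0, 3.0};
        System.out.println("alpha = 1.0 (pure L1): " + penalty(w, 1.0, 1.0));
        System.out.println("alpha = 0.5 (mixed):   " + penalty(w, 1.0, 0.5));
        System.out.println("alpha = 0.0 (pure L2): " + penalty(w, 1.0, 0.0));
    }
}
```

Under this convention, a solver that only handles L1 corresponds to fixing `alpha` at 1.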

Because JSAT's implementations are based on the detailed papers from the LIBLINEAR team, the performance is very similar. The LIBLINEAR implementation uses less memory since it works on 32-bit floats instead of 64-bit doubles, which can also make the LIBLINEAR implementation somewhat faster for some problems. For datasets with more than 2^31 rows, JSAT won't work due to limitations in Java, but LIBLINEAR should still work.
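As a rough back-of-the-envelope illustration of the two points above, the sketch below takes the float-versus-double comparison at face value and assumes dense storage, which is a simplification since both libraries can work with sparse data. It also shows the 32-bit `int` index limit that Java arrays and collections impose. The class and method names are hypothetical.

```java
/** Hypothetical sketch for rough sizing; NOT part of JSAT or LIBLINEAR. */
final class DenseFootprintSketch {

    /** Bytes needed to store rows x cols values densely, failing loudly on overflow. */
    static long denseBytes(long rows, long cols, int bytesPerValue) {
        return Math.multiplyExact(Math.multiplyExact(rows, cols), (long) bytesPerValue);
    }

    /** Java arrays are indexed by int, so a row count must fit in a signed 32-bit value. */
    static boolean fitsJavaArrayIndex(long rows) {
        return rows >= 0 && rows <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        long cols = 100L;
        System.out.println("float bytes:  " + denseBytes(rows, cols, Float.BYTES));
        System.out.println("double bytes: " + denseBytes(rows, cols, Double.BYTES));
        System.out.println("2^31 rows indexable: " + fitsJavaArrayIndex(1L << 31));
    }
}
```

Under these assumptions, the double representation needs exactly twice the memory of the float one for the same dense matrix.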

JSAT supports non-exact linear solvers that can be much faster, especially for very large datasets. JSAT also implements more generic batch optimizers that can be used to implement (slower versions of) the same algorithms in LIBLINEAR, but that also offer the flexibility to be used for a wider class of problems if desired.