Classifiers
The standard supervised learning problem comprises a set of training examples {(x1,y1),...,(xm,ym)} drawn independently from the same unknown distribution (i.i.d.). Each example (xi,yi) has two components: a feature vector xi and a label yi. The feature vector is generally a fixed-length list of numerical features (or attributes). When the label is real-valued, the problem is called regression. When the label belongs to a discrete set, the problem is called classification. The following algorithms solve the binary classification problem, where the label takes one of two values.
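To make the setup concrete, here is a minimal Python sketch of a training set and of the distinction between regression and classification. The data, the `Example` alias, and the `problem_type` helper are hypothetical illustrations, not part of malibu, and the rule "float labels mean regression" is only a crude heuristic assumed for this example.

```python
from typing import List, Tuple

# Hypothetical toy training set: each example is (feature vector x_i, label y_i).
Example = Tuple[List[float], int]

training_set: List[Example] = [
    ([5.1, 3.5], +1),
    ([4.9, 3.0], +1),
    ([6.7, 3.1], -1),
    ([6.3, 2.5], -1),
]

# Feature vectors are fixed-length: every example has the same number of attributes.
assert len({len(x) for x, _ in training_set}) == 1


def problem_type(labels) -> str:
    """Name the learning problem implied by a collection of labels.

    Assumption for this sketch: float labels are treated as real-valued
    (regression); anything else is treated as a discrete label set.
    """
    distinct = set(labels)
    if any(isinstance(y, float) for y in distinct):
        return "regression"
    if len(distinct) <= 2:
        return "binary classification"
    return "multiclass classification"


labels = [y for _, y in training_set]
print(problem_type(labels))
```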
A decision tree builds an interpretable model that represents a set of rules. It is a popular tool for classification that is relatively fast both to train and to make predictions.
Features
- Requires/handles a large dataset
- Does not require data preprocessing
- Handles missing data
- Reduces overfitting through pruning
- Handles continuous and nominal attributes
- Several attribute selection measures
- Performs subsetting on discrete values
- Graphical Models using: graphviz

Tutorial: http://dms.irb.hr/tutorial/tut_dtrees.php
Quinlan, J. Ross. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993.
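The idea that a decision tree "represents a set of rules" can be sketched in a few lines of Python. The tree below is hand-built for illustration (a learner would induce it from data), and the dictionary layout and the `classify` and `rules` helpers are assumptions of this sketch, not malibu's data structures. It covers both kinds of attribute: a threshold split for a continuous attribute and one branch per value for a nominal one.

```python
# Hypothetical hand-built tree. Leaves are plain strings (the predicted label).
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity", "threshold": 75,
                  "le": "play", "gt": "stay"},
        "overcast": "play",
        "rain": {"attribute": "windy",
                 "branches": {True: "stay", False: "play"}},
    },
}


def classify(node, example):
    """Follow one root-to-leaf path; the leaf reached is the predicted label."""
    while isinstance(node, dict):
        value = example[node["attribute"]]
        if "threshold" in node:      # continuous attribute: binary split
            node = node["le"] if value <= node["threshold"] else node["gt"]
        else:                        # nominal attribute: one branch per value
            node = node["branches"][value]
    return node


def rules(node, conditions=()):
    """Enumerate the tree as a list of (conditions, label) rules."""
    if not isinstance(node, dict):
        return [(list(conditions), node)]
    name = node["attribute"]
    out = []
    if "threshold" in node:
        t = node["threshold"]
        out += rules(node["le"], conditions + (f"{name} <= {t}",))
        out += rules(node["gt"], conditions + (f"{name} > {t}",))
    else:
        for value, child in node["branches"].items():
            out += rules(child, conditions + (f"{name} = {value}",))
    return out


for conditions, label in rules(tree):
    print(" AND ".join(conditions), "=>", label)
```

Each root-to-leaf path becomes one rule, which is what makes the model easy to inspect.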
A decision tree builds an interpretable model that represents a set of rules. It is a popular tool for classification that is relatively fast both to train and to make predictions.
Features
- Does not require data preprocessing
- Requires/handles a large dataset
- Handles missing attributes
- Supports weighted examples
- Computationally efficient for discrete attributes
- Attribute selection: Misclassification, Gini, Entropy, TopDown, Least-squared
- Control over attribute reuse
- Graphical Models using: graphviz

Tutorial: http://dms.irb.hr/tutorial/tut_dtrees.php
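Three of the attribute selection measures listed above (misclassification, Gini, and entropy) are standard impurity functions, and a split is typically scored by how much it reduces impurity. The following Python sketch shows the textbook definitions; the function names are hypothetical and it is not claimed that malibu computes them exactly this way. The TopDown and Least-squared measures are not covered here.

```python
import math
from collections import Counter


def class_probabilities(labels):
    """Empirical class frequencies of a non-empty list of labels."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def misclassification(labels):
    """Error rate of always predicting the majority class."""
    return 1.0 - max(class_probabilities(labels))


def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    return 1.0 - sum(p * p for p in class_probabilities(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum_k p_k log2 p_k."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))


def split_gain(impurity, parent, children):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


parent = [1, 1, 0, 0]
perfect_split = [[1, 1], [0, 0]]
for measure in (misclassification, gini, entropy):
    print(measure.__name__, split_gain(measure, parent, perfect_split))
```

A tree learner would evaluate `split_gain` for each candidate attribute and split on the one with the largest gain.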
The ADTree classifier leverages a boosting procedure to build an alternating decision tree or option tree, a generalization of the decision tree. The final decision is reached by summing the prediction values of every node an example reaches; the sign of the sum gives the predicted class.
Features
- Does not require data preprocessing
- Powerful classifier, intuitive graphical model
- Computationally efficient for discrete attributes
- Graphical Models using: graphviz

Application: Predicting Genetic Regulatory Response
Freund, Yoav, and Llew Mason. "The Alternating Decision Tree Learning Algorithm." Paper presented at the 16th International Conference on Machine Learning, Bled, Slovenia, 1999.
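How an alternating decision tree reaches a decision by summation can be illustrated with a small Python sketch. The tree, its prediction values, and the `score` and `classify` helpers are hypothetical; the boosting procedure that would learn such a tree is not shown. Unlike an ordinary decision tree, an example may follow several paths at once, because a prediction node can have more than one splitter beneath it.

```python
# Hypothetical alternating decision tree. A prediction node holds a real value
# and zero or more splitter nodes; each splitter tests the example and leads
# to one of two child prediction nodes.
adtree = {
    "value": 0.5,
    "splitters": [
        {"test": lambda x: x["a"] < 4.5,
         "yes": {"value": -0.7, "splitters": [
             {"test": lambda x: x["b"] > 1.0,
              "yes": {"value": 0.4},
              "no": {"value": -0.2}},
         ]},
         "no": {"value": 0.2}},
        {"test": lambda x: x["b"] > 0.0,
         "yes": {"value": 0.3},
         "no": {"value": -0.6}},
    ],
}


def score(prediction_node, example):
    """Sum the values of every prediction node the example reaches."""
    total = prediction_node["value"]
    for splitter in prediction_node.get("splitters", []):
        branch = "yes" if splitter["test"](example) else "no"
        total += score(splitter[branch], example)
    return total


def classify(root, example):
    """The sign of the summed score is the predicted class (+1 or -1)."""
    return 1 if score(root, example) >= 0 else -1


example = {"a": 3.0, "b": -1.0}
print(score(adtree, example), classify(adtree, example))
```

The magnitude of the score can also be read as a rough measure of confidence in the prediction.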
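The support vector machine (SVM) classifier employs the kernel trick to perform non-linear classification: it fits a linear decision boundary in a kernel-induced feature space, which corresponds to a non-linear boundary in the original input space. The SVM implementation used in malibu is LIBSVM.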
Features
- Sparse solution
- Handles noise in data
- Simple model selection

Short Tutorial: http://www.idiom.com/~zilla/Work/Notes/svmtutorial.pdf
Chang, Chih-Chung, and Chih-Jen Lin. LIBSVM: A Library for Support Vector Machines. 2003 [cited May 5, 2003]. Available from http://www.csie.ntu.edu.tw/~cjlin.
Cortes, Corinna, and Vladimir Vapnik. "Support-Vector Networks." Machine Learning 20, no. 3 (1995): 273-97.
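The kernel trick can be illustrated with the form of a kernel SVM's decision function: a weighted sum of kernel evaluations between the input and the support vectors, plus a bias. The Python sketch below uses an RBF kernel on the XOR problem, which no linear boundary in the input space can solve. The coefficients and bias are hand-picked for illustration and are not the output of LIBSVM training; the function names are hypothetical and this is not LIBSVM's API.

```python
import math


def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel: exp(-gamma * ||u - v||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, coefficients, bias, kernel=rbf_kernel):
    """f(x) = sum_i c_i * K(sv_i, x) + bias, where c_i = alpha_i * y_i."""
    return sum(c * kernel(sv, x)
               for sv, c in zip(support_vectors, coefficients)) + bias


def predict(x, support_vectors, coefficients, bias):
    """Predicted class is the sign of the decision value."""
    return 1 if decision_value(x, support_vectors, coefficients, bias) >= 0 else -1


# XOR data: not linearly separable in the original two-dimensional space.
points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [-1, -1, +1, +1]

# Assumption: every point is a support vector with alpha_i = 1 and bias = 0.
coefficients = [1.0 * y for y in labels]
bias = 0.0

for x, y in zip(points, labels):
    print(x, y, predict(x, points, coefficients, bias))
```

Only the support vectors enter the sum, which is why the trained model is sparse: training points with a zero coefficient can be discarded.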