Random Forest: in addition to bagging (bootstrapping the input matrix vertically, i.e. resampling the rows), a random forest also subsamples it horizontally: at each split, a tree only considers a randomly chosen subset of the columns (features). This reduces the correlation among the trees, which in turn improves the performance of the final ensemble.
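A minimal sketch of this in scikit-learn, on synthetic data (the dataset and parameter values here are just for illustration): `bootstrap=True` gives the row-wise resampling and `max_features` gives the column subsampling that decorrelates the trees.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

rf = RandomForestClassifier(
    n_estimators=200,
    bootstrap=True,       # row-wise bootstrap (bagging)
    max_features="sqrt",  # random subset of columns at each split
    random_state=0,
)
print(cross_val_score(rf, X, y, cv=5).mean())
```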
Boosting: if in bagging all the simple trees are created equal, in boosting they are not. Boosting is done sequentially: after the first tree is fitted, the second tree is made to focus on the examples the first one misclassified, and each subsequent tree keeps focusing on the mistakes made by the trees before it.
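A quick sketch of the idea with AdaBoost in scikit-learn (synthetic data, default weak learners): each new tree is trained on a re-weighted sample that emphasizes the examples the previous trees got wrong.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Trees are fitted sequentially; misclassified examples get larger weights
# before the next tree is fitted.
boost = AdaBoostClassifier(n_estimators=200, learning_rate=0.5, random_state=0)
print(cross_val_score(boost, X, y, cv=5).mean())
```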
Ensemble: both Random Forest and Gradient Boosting Machine are examples of ensemble methods. In general, one can build an ensemble of ensembles of ensembles (as in the winning solution to the Netflix Prize). For classification, an ensemble can be a majority vote (this is what random forest does). For regression, an ensemble can average the individual regressors. More sophisticated combiners (but also more prone to overfitting) are linear regression, ridge, lasso, and Bayesian model averaging. I like non-negative least squares.
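A rough sketch of the non-negative least squares blend I have in mind (my own illustration, on synthetic data): collect out-of-fold predictions from each base model, then solve for non-negative blending weights.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

models = [RandomForestRegressor(random_state=0),
          GradientBoostingRegressor(random_state=0)]

# Out-of-fold predictions from each base model become the columns of P.
P = np.column_stack([cross_val_predict(m, X, y, cv=5) for m in models])

# Non-negative least squares gives the blending weights.
w, _ = nnls(P, y)
print("blend weights:", w / w.sum())
```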
Margin maximization (as in SVM): according to Yann LeCun, margin maximization is similar to L2 regularization.
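One way to see the connection: in the standard soft-margin SVM objective, maximizing the margin 2/||w|| is the same as minimizing ||w||^2, which is exactly an L2 penalty on the weights added to the hinge loss:

$$
\min_{w,b}\ \tfrac{1}{2}\lVert w\rVert_2^2 \;+\; C\sum_i \max\bigl(0,\ 1 - y_i(w^\top x_i + b)\bigr)
$$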
Feature engineering: including polynomial terms (x^2, sqrt(x), log(x)) and interaction terms like x1*x2. Wavelet transforms seem to be useful in vision tasks. SVM kernels can be thought of as a form of feature engineering.
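A small sketch of the basic transforms mentioned above (made-up data; the feature set is just illustrative): a degree-2 polynomial expansion adds the squared and interaction terms, and sqrt/log are added by hand.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.abs(np.random.RandomState(0).randn(100, 2)) + 1.0  # positive features

# Degree-2 expansion adds x1^2, x2^2 and the interaction term x1*x2.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)

# Hand-crafted transforms: sqrt(x) and log(x).
X_extra = np.hstack([X_poly, np.sqrt(X), np.log(X)])
print(X_extra.shape)
```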
Overall, the portfolio of (out-of-the-box) machine learning tools is:
1. Random Forest
2. Gradient Boosting Machine (and other boosting methods, e.g. AdaBoost)
3. SVM
4. Lasso / Ridge / Elastic Net / Linear / Logistic / GLM
5. Gaussian Process
6. LDA / QDA
7. Naive Bayes
8. Neural Network
6 and 7 are generative models; the rest are discriminative (a Gaussian process, as usually used, models p(y|x) directly, so it is discriminative too). 3 and 5 are kernel methods. 1 and 2 are tree-based methods.
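For completeness, a quick sketch (scikit-learn, synthetic data, mostly default settings, so the numbers mean nothing by themselves) running this portfolio side by side with cross-validation:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                            QuadraticDiscriminantAnalysis)
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

portfolio = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Process": GaussianProcessClassifier(random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "Naive Bayes": GaussianNB(),
    "Neural Network": MLPClassifier(max_iter=1000, random_state=0),
}

for name, model in portfolio.items():
    score = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:20s} {score:.3f}")
```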