Deep learning seems harder to understand than other machine learning methods.
Here are the basic ideas that I've got:
1. Unsupervised pre-training (using a Restricted Boltzmann Machine (RBM))
a. At the bottom layer is your input.
b. Generate the first hidden layer using an RBM. To understand what an RBM is, see Intro to RBM.
c. Using the newly generated hidden layer as input, generate the next hidden layer in the same manner as in b. Repeat for as many layers as you want.
d. Finally, connect the top hidden layer to the output, for example via logistic regression.
2. After all the weights are initialized, use standard backpropagation to fine-tune them. Backpropagation is just the usual gradient descent; the name emphasizes that the gradient is passed back through multiple layers using the chain rule for derivatives.
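To make step 1 concrete, here is a minimal NumPy sketch of greedy layer-wise pre-training. It is my own illustration, not the code from any tutorial: the `RBM` class, `pretrain_stack`, the layer sizes, the learning rate, and the random binary data are all assumptions, and each RBM is trained with a single step of contrastive divergence (CD-1) on the full batch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one step of contrastive divergence (CD-1)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_vis = np.zeros(n_visible)
        self.b_hid = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_hid)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_vis)

    def cd1_step(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one reconstruction step.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_vis += lr * (v0 - v1).mean(axis=0)
        self.b_hid += lr * (h0 - h1).mean(axis=0)

def pretrain_stack(data, layer_sizes, epochs=5, seed=0):
    """Train one RBM per layer; each layer's hidden output feeds the next."""
    rng = np.random.default_rng(seed)
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden, rng)
        for _ in range(epochs):
            rbm.cd1_step(layer_input)
        rbms.append(rbm)
        # Hidden probabilities become the "visible" data of the next RBM.
        layer_input = rbm.hidden_probs(layer_input)
    return rbms, layer_input

# Toy binary data standing in for real inputs (an assumption for the demo).
rng = np.random.default_rng(1)
X = (rng.random((64, 20)) < 0.5).astype(float)
rbms, top_features = pretrain_stack(X, layer_sizes=[16, 8])
```

The weights and hidden biases stored in `rbms` would then serve as the initial values of a feed-forward network, with a logistic regression layer on top of `top_features`, before the backpropagation fine-tuning of step 2.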
Instead of an RBM, one can use an auto-encoder, a denoising auto-encoder, or a sparse auto-encoder. The idea behind these unsupervised methods is similar to PCA, in that they try to find a representation of the inputs that is hopefully more useful for predicting the output. PCA, however, is linear, so stacking another layer of PCA on top is still linear and does not help; these methods are nonlinear, so each extra layer can add something new. For example, in a vision task where the input (lowest layer) is the pixel intensities, the next layer might learn to find edges (similar to Gabor wavelets), the next might represent figures, the next might learn objects, and so on.
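The point about linearity can be checked numerically. The sketch below is only an illustration under my own assumptions (random weight matrices `W1` and `W2`, a sigmoid nonlinearity, no bias terms): two stacked linear layers collapse into a single linear map, while two stacked sigmoid layers do not behave like a linear map.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))    # 5 samples, 4 input features
W1 = rng.normal(size=(4, 3))   # first layer weights (hypothetical)
W2 = rng.normal(size=(3, 2))   # second layer weights (hypothetical)

# Two linear layers are equivalent to one linear layer with weights W1 @ W2.
two_linear = (X @ W1) @ W2
one_linear = X @ (W1 @ W2)
linear_collapses = np.allclose(two_linear, one_linear)

# With a nonlinearity between layers, the stack is no longer linear.
def nonlinear_stack(x):
    return sigmoid(sigmoid(x @ W1) @ W2)

# A linear map f satisfies f(a + b) == f(a) + f(b); test that property.
a, b = X[:1], X[1:2]
is_additive = np.allclose(nonlinear_stack(a + b),
                          nonlinear_stack(a) + nonlinear_stack(b))

print("linear layers collapse:", linear_collapses)
print("nonlinear stack is additive:", is_additive)
```

This is why adding depth only pays off when each layer applies a nonlinear transformation.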
These methods can be thought of as automatic feature engineering.
The tutorial code at Deep Learning Tutorial is helpful.