During this course we'll walk through a few popular tricks for getting neural networks to work better. We'll explain what they do and how to adjust them. Where possible, we'll talk through why they work. The major topics are:
- Regularization. Adding penalties on the weights to keep them from growing too big and to gently encourage them toward zero.
- Dropout. Temporarily deactivating a random subset of nodes on each training pass, which nudges different parts of the network to learn in different directions.
- Initialization. Choosing how to randomly distribute the initial values of the weights.
- Optimizers. There are lots of alternatives to straight gradient descent. We'll code up a few of the popular ones.
- Batching. Using a small batch of examples to estimate the gradient, rather than doing it separately for each example.
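To make a few of these ideas concrete before we dig in, here is a minimal sketch that combines three of them: mini-batch gradient descent with an L2 weight penalty on a toy one-dimensional regression problem. All the names and hyperparameter values here (the learning rate, penalty strength, and batch size) are illustrative choices, not settings from the course.

```python
import numpy as np

# Toy data: y = 3x + noise
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 1))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=256)

w = np.zeros(1)                   # single weight, started at zero for this toy
lr, lam, batch = 0.1, 1e-3, 32    # learning rate, L2 strength, batch size (illustrative)

for epoch in range(50):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch):
        b = idx[start:start + batch]
        err = X[b] @ w - y[b]
        # Gradient of the mean squared error plus the L2 penalty lam * ||w||^2
        grad = X[b].T @ err / len(b) + 2 * lam * w
        w -= lr * grad

print(w)  # close to 3, pulled slightly toward zero by the penalty
```

The point to notice is how little machinery each trick adds: the regularizer is one extra term in the gradient, and batching is just a slice of the shuffled data per step.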
The other topic we'll work through is reporting: giving a concise summary of all the options used in a given model. This is a powerful (and shockingly often neglected!) way to make your results repeatable.
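One lightweight way to report a run, sketched below, is to gather every knob into a single dictionary and dump it alongside the results. The particular fields and values here are hypothetical placeholders, not a required schema.

```python
import json

# Hypothetical run configuration: one dict holding every option that
# affects the result, so the whole setup can be logged and re-run.
config = {
    "seed": 17,
    "initializer": "normal(0, 0.1)",
    "regularization": {"type": "L2", "lambda": 1e-3},
    "dropout_rate": 0.2,
    "optimizer": {"name": "sgd", "learning_rate": 0.1},
    "batch_size": 32,
    "epochs": 50,
}

# Printing (or saving) this next to your metrics is the whole trick.
print(json.dumps(config, indent=2, sort_keys=True))
```

Anyone reading the output, including future you, can reconstruct the experiment without digging through the code.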