2.5 Code tour: 2D convolution backward pass

Here are the other convolution resources I mention in the video:


In-depth one-dimensional convolution math and blog post:

https://e2eml.school/convolution_one_d.html


In-depth one-dimensional convolution video series:

https://youtu.be/4ERudRAxyGE?list=PLVZqlMpoM6ka9uPzaSCpg75AeS7wPpjl9


Video on How Convolution Works:

https://youtu.be/B-M5q51U8SM


Course 321, which includes the derivation, implementation, and application of one-dimensional convolutional neural networks:

https://end-to-end-machine-learning.teachable.com/p/321-convolutional-neural-networks


A vintage (2016) convolutional neural network walkthrough video:

https://youtu.be/FmpDIaiMIeA



Postscript

After I published this course I found a bug in the computation of the weight gradients. The details of the bug are less important than the implications. Learning by building is a creative endeavor and by its nature involves a lot of trial and error. Due to their complexity, machine learning algorithms make excellent hiding places for bugs. So even though the code ran and appeared to perform well, it wasn’t as healthy as I believed. I’m leaving the original walkthrough video in place with the link to the updated code here, in an effort to model transparency and public error correction.

If you are curious, the bug was in how I implemented the cross-correlation between the output gradient and the inputs in the convolution layer to get the weight gradient (around 1:55 in the video above). Unlike convolution, cross-correlation is sensitive to the order of its arguments. I had them in the wrong order, and as a result the calculated weight gradient was reversed. I also needed to add some padding, as in the calculation for the input gradient, to get everything to work out right. This bug is fairly low level and doesn’t obscure the concepts involved, so I am less worried about it from a teaching point of view.
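To make the argument-order point concrete, here is a minimal pure-Python sketch. It is not the course code: the helper names (xcorr2d_valid, zero_pad, flip2d, numeric_weight_grad) and the toy arrays are made up for illustration, and it assumes a single-channel layer whose forward pass is a valid-mode cross-correlation with stride 1. Under those assumptions, cross-correlating the inputs with the output gradient gives a weight gradient that agrees with a finite-difference check, while swapping the two arguments (with zero padding so the shapes work out) gives the same numbers reversed along both axes.

```python
def xcorr2d_valid(image, kernel):
    """Slide `kernel` over `image` (no flipping) wherever it fully fits."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    n_rows = len(image) - k_rows + 1
    n_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(
                image[i + m][j + n] * kernel[m][n]
                for m in range(k_rows)
                for n in range(k_cols)
            )
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


def zero_pad(array, pad):
    """Surround a 2D list with `pad` rows and columns of zeros."""
    width = len(array[0]) + 2 * pad
    top = [[0.0] * width for _ in range(pad)]
    middle = [[0.0] * pad + list(row) + [0.0] * pad for row in array]
    bottom = [[0.0] * width for _ in range(pad)]
    return top + middle + bottom


def flip2d(array):
    """Reverse a 2D list along both axes."""
    return [row[::-1] for row in array[::-1]]


# Hypothetical toy values, chosen only for this illustration.
inputs = [[1.0, 2.0, 0.0, 1.0],
          [0.0, 1.0, 3.0, 2.0],
          [2.0, 0.0, 1.0, 1.0],
          [1.0, 3.0, 2.0, 0.0]]
weights = [[0.5, -1.0],
           [2.0, 0.25]]
output_grad = [[1.0, 0.0, -1.0],
               [2.0, 1.0, 0.0],
               [0.0, -2.0, 1.0]]


def loss(w):
    """Stand-in loss whose gradient with respect to the outputs is `output_grad`."""
    outputs = xcorr2d_valid(inputs, w)
    return sum(
        g * y
        for g_row, y_row in zip(output_grad, outputs)
        for g, y in zip(g_row, y_row)
    )


def numeric_weight_grad(w, eps=1e-4):
    """Central-difference estimate of d(loss)/d(weights)."""
    grad = [[0.0] * len(w[0]) for _ in w]
    for m in range(len(w)):
        for n in range(len(w[0])):
            up = [row[:] for row in w]
            down = [row[:] for row in w]
            up[m][n] += eps
            down[m][n] -= eps
            grad[m][n] = (loss(up) - loss(down)) / (2 * eps)
    return grad


# Inputs first, output gradient second: same shape as the weights.
weight_grad = xcorr2d_valid(inputs, output_grad)

# Swapped order: pad the output gradient so the larger input array fits.
swapped = xcorr2d_valid(zero_pad(output_grad, len(inputs) - 1), inputs)
start = len(output_grad) - 1
size = len(weights)
center = [row[start:start + size] for row in swapped[start:start + size]]

print("weight gradient:       ", weight_grad)
print("finite-difference check:", numeric_weight_grad(weights))
print("swapped-order center:   ", center)
print("reversed?", center == flip2d(weight_grad))
```

The swapped-order result is a 6 by 6 array, and only its central 2 by 2 block lines up with the weights, which is one way a shape mismatch and a need for padding can show up alongside the reversal. A finite-difference check like numeric_weight_grad is a cheap way to catch this kind of bug, since code with a reversed gradient can still run without errors.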

Still, it’s pretty embarrassing. My apologies if it tripped you up in any way.
