If playback doesn't begin shortly, try restarting your device.
•
You're signed out
Videos you watch may be added to the TV's watch history and influence TV recommendations. To avoid this, cancel and sign in to YouTube on your computer.
CancelConfirm
Share
An error occurred while retrieving sharing information. Please try again later.
We take the 2-layer MLP (with BatchNorm) from the previous video and backpropagate through it manually without using PyTorch autograd's loss.backward(): through the cross entropy loss, 2nd linear layer, tanh, batchnorm, 1st linear layer, and the embedding table. Along the way, we get a strong intuitive understanding about how gradients flow backwards through the compute graph and on the level of efficient Tensors, not just individual scalars like in micrograd. This helps build competence and intuition around how neural nets are optimized and sets you up to more confidently innovate on and debug modern neural networks.
!!!!!!!!!!!!
I recommend you work through the exercise yourself but work with it in tandem and whenever you are stuck unpause the video and see me give away the answer. This video is not super intended to be simply watched. The exercise is here:
https://colab.research.google.com/dri...
!!!!!!!!!!!!
Links:
…...more
Building makemore Part 4: Becoming a Backprop Ninja
4.5KLikes
260,102Views
2022Oct 11
We take the 2-layer MLP (with BatchNorm) from the previous video and backpropagate through it manually without using PyTorch autograd's loss.backward(): through the cross entropy loss, 2nd linear layer, tanh, batchnorm, 1st linear layer, and the embedding table. Along the way, we get a strong intuitive understanding about how gradients flow backwards through the compute graph and on the level of efficient Tensors, not just individual scalars like in micrograd. This helps build competence and intuition around how neural nets are optimized and sets you up to more confidently innovate on and debug modern neural networks.
!!!!!!!!!!!!
I recommend you work through the exercise yourself but work with it in tandem and whenever you are stuck unpause the video and see me give away the answer. This video is not super intended to be simply watched. The exercise is here:
https://colab.research.google.com/dri...
!!!!!!!!!!!!
Links:
Chapters:
00:00:00 intro: why you should care & fun history
00:07:26 starter code
00:13:01 exercise 1: backproping the atomic compute graph
01:05:17 brief digression: bessel’s correction in batchnorm
01:26:31 exercise 2: cross entropy loss backward pass
01:36:37 exercise 3: batch norm layer backward pass
01:50:02 exercise 4: putting it all together
01:54:24 outro…...more