Learn Before
Concept
Gradient Descent with Momentum Pseudocode
On iteration t: Compute dW, db on the current mini-batch Note that now we have two parameters and .
0
1
Updated 2021-03-19
Tags
Data Science
Related
Intuition behind Gradient Descent with Momentum
These plots were generated with gradient descent; with gradient descent with momentum (β = 0.5) and gradient descent with momentum (β = 0.9). Which curve corresponds to which algorithm?
Gradient Descent with Momentum Pseudocode
Adam (Deep Learning Optimization Algorithm)