Learn Before
Velocity Initialization in Momentum Method
When implementing the momentum method in optimization, the velocity vector accumulates past gradients and is used to update the parameters. At the beginning of the optimization process, specifically at time $t = 0$, no gradients have been observed yet, so this velocity vector is conveniently initialized to zero, denoted as $\mathbf{v}_0 = \mathbf{0}$.
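The zero initialization can be illustrated with a minimal sketch in plain Python. It assumes the common update form in which the velocity is the previous velocity scaled by a momentum coefficient plus the current gradient, and the parameters then move against the velocity. The names `momentum_descent`, `quadratic_grad`, `lr`, `beta`, and `num_steps` are hypothetical and chosen only for illustration, as is the quadratic objective.

```python
def momentum_descent(grad_fn, x0, lr=0.1, beta=0.9, num_steps=3):
    """Run a few momentum steps and record the velocity after each one.

    Assumed update form (one common convention, not the only one):
        v_t = beta * v_{t-1} + g_t
        x_t = x_{t-1} - lr * v_t
    """
    x = [float(xi) for xi in x0]
    v = [0.0] * len(x)          # velocity initialized to zero at t = 0
    velocities = [list(v)]      # velocities[0] is the initial velocity
    for _ in range(num_steps):
        g = grad_fn(x)
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        x = [xi - lr * vi for xi, vi in zip(x, v)]
        velocities.append(list(v))
    return x, velocities


def quadratic_grad(x):
    # Gradient of the illustrative objective f(x) = sum(x_i ** 2).
    return [2.0 * xi for xi in x]


x_final, velocities = momentum_descent(quadratic_grad, [1.0, -2.0])
```

Because the velocity starts at zero, the first update in this sketch reduces to an ordinary gradient step: `velocities[1]` equals the gradient at the starting point, and the accumulated history only begins to matter from the second step onward.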
Tags
D2L
Dive into Deep Learning @ D2L
Related
Intuition behind Gradient Descent with Momentum
These plots were generated with gradient descent; with gradient descent with momentum (β = 0.5) and gradient descent with momentum (β = 0.9). Which curve corresponds to which algorithm?
Adam (Deep Learning Optimization Algorithm)
Origin of the Momentum Method
Velocity Initialization in Momentum Method
Momentum Convergence on a Scalar Quadratic
Gradient Descent with Momentum Pseudocode