Concept

Intuition behind the Nesterov algorithm

Nesterov example:

  • Suppose the parameters have reached the edge of a local or global minimum.
  • Momentum then carries the parameters further down, toward the bottom of the minimum.
  • Once at the bottom, we evaluate the gradient at that new point. It is now almost zero, so the update stays small and we continue to converge.

Simple momentum example:

  • Suppose the parameters have reached the edge of a local or global minimum.
  • Momentum then carries the parameters further down, toward the bottom of the minimum.
  • Once at the bottom, instead of staying there, the parameters move back out of the minimum, because the update used the gradient from the previous position, which was very large (and the iterates might even diverge).
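
The two walkthroughs above can be compared in a minimal sketch. It is an illustration under stated assumptions, not a reference implementation. The toy loss f(x) = 0.5 * k * x**2, the helper names `grad` and `run`, and the values of `x0`, `lr`, `mu`, and `steps` are all chosen here for illustration and do not come from the original note. The only difference between the two runs is where the gradient is evaluated: at the current point for simple momentum, at the look-ahead point `x + mu * v` for Nesterov.

```python
def grad(x, k=1.0):
    """Gradient of the assumed toy loss f(x) = 0.5 * k * x**2 (minimum at x = 0)."""
    return k * x


def run(nesterov, x0=5.0, lr=0.1, mu=0.9, steps=200):
    """Return every parameter value visited, starting from x0 (hypothetical helper)."""
    x, v = x0, 0.0
    path = [x]
    for _ in range(steps):
        # Nesterov evaluates the gradient at the look-ahead point x + mu * v;
        # simple momentum evaluates it at the current point x.
        point = x + mu * v if nesterov else x
        v = mu * v - lr * grad(point)  # velocity update
        x = x + v                      # parameter update
        path.append(x)
    return path


momentum_path = run(nesterov=False)
nesterov_path = run(nesterov=True)

# Overshoot: how far past the minimum (x = 0) each method travels.
momentum_overshoot = -min(momentum_path)
nesterov_overshoot = -min(nesterov_path)
print(f"simple momentum overshoot: {momentum_overshoot:.3f}")
print(f"Nesterov overshoot:        {nesterov_overshoot:.3f}")
```

With these particular settings, both runs settle at the minimum, but the Nesterov run travels noticeably less far past it. The size of the gap depends on the loss, learning rate, and momentum coefficient, so this single toy case should not be read as a general guarantee.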

Updated 2020-11-16

Tags

Data Science