Gradient Descent on a Nonconvex Function with Local Minima

For nonconvex objective functions, gradient descent can converge to a local minimum rather than the global minimum, and which local minimum it reaches depends on both the learning rate and how well conditioned the problem is. As an illustration, consider the function $f(x) = x \cdot \cos(cx)$ for a nonzero constant $c$, which has infinitely many local minima because of its oscillatory structure. When gradient descent is applied with an unrealistically high learning rate, the algorithm takes steps large enough to jump over lower-valued minima and settles into a poor local minimum. The learning rate therefore affects not only convergence speed but also which solution gradient descent ultimately finds on a nonconvex landscape.
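The effect can be reproduced with a minimal sketch in plain Python. The specific values here are assumptions chosen for illustration and are not given in the text above: $c = 0.15\pi$, a starting point of $x = 10$, ten update steps, and the two learning rates 2.0 and 0.2. The helper names `f`, `f_grad`, and `gd` are likewise hypothetical. The gradient follows from the product rule: $f'(x) = \cos(cx) - c\,x \sin(cx)$.

```python
import math

# Assumed constant for illustration; the text only says "a constant c".
C = 0.15 * math.pi


def f(x):
    """Objective f(x) = x * cos(c * x)."""
    return x * math.cos(C * x)


def f_grad(x):
    """Derivative of f via the product rule: cos(cx) - c * x * sin(cx)."""
    return math.cos(C * x) - C * x * math.sin(C * x)


def gd(eta, x0=10.0, steps=10):
    """Run plain gradient descent and return every iterate, including x0.

    eta, x0 and steps are illustrative defaults, not values from the text.
    """
    x = x0
    trace = [x]
    for _ in range(steps):
        x -= eta * f_grad(x)  # standard update: x <- x - eta * f'(x)
        trace.append(x)
    return trace


if __name__ == "__main__":
    for eta in (2.0, 0.2):
        x_final = gd(eta)[-1]
        print(f"eta={eta}: x={x_final:.4f}, f(x)={f(x_final):.4f}")
```

Under these assumptions, the run with the larger learning rate leaves the basin nearest its starting point and ends up oscillating around a shallow minimum close to the origin, while the run with the smaller learning rate descends into the nearby, much lower minimum. Other choices of `C`, `x0`, or `eta` can land in different basins, which is the point of the example.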

Updated 2026-05-15

Tags

D2L

Dive into Deep Learning @ D2L