Objective Function Change Bounds in Gradient Descent

Assume the objective function $f$ is Lipschitz continuous with constant $L$, meaning that for any $\mathbf{x}$ and $\mathbf{y}$ it satisfies $|f(\mathbf{x}) - f(\mathbf{y})| \leq L \|\mathbf{x} - \mathbf{y}\|$. For a gradient descent update $\mathbf{x} \gets \mathbf{x} - \eta \mathbf{g}$ with learning rate $\eta > 0$, setting $\mathbf{y} = \mathbf{x} - \eta \mathbf{g}$ gives $\|\mathbf{x} - \mathbf{y}\| = \eta \|\mathbf{g}\|$, so the change in the objective value is bounded by $|f(\mathbf{x}) - f(\mathbf{x} - \eta\mathbf{g})| \leq L \eta \|\mathbf{g}\|$. The largest possible change in the loss during a single step is therefore controlled by the learning rate $\eta$, the gradient norm $\|\mathbf{g}\|$, and the Lipschitz constant $L$. A small upper bound is a trade-off: it limits how fast the objective value can be reduced, but it also limits how much damage a single step in the wrong direction can do.
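The bound can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes the illustrative objective $f(\mathbf{x}) = \sqrt{1 + \|\mathbf{x}\|^2}$, whose gradient norm is always below 1, so $L = 1$ is a valid Lipschitz constant. The function names `f`, `grad_f`, and `step_change_and_bound`, and the learning rates tried, are hypothetical choices made for this example.

```python
import numpy as np

# Assumed example objective: f(x) = sqrt(1 + ||x||^2).
# Its gradient x / sqrt(1 + ||x||^2) has norm < 1, so f is Lipschitz with L = 1.
L = 1.0


def f(x):
    return np.sqrt(1.0 + np.dot(x, x))


def grad_f(x):
    return x / np.sqrt(1.0 + np.dot(x, x))


def step_change_and_bound(x, eta):
    """Return (|f(x) - f(x - eta*g)|, L * eta * ||g||) for one gradient step."""
    g = grad_f(x)
    x_new = x - eta * g
    change = abs(f(x) - f(x_new))
    bound = L * eta * np.linalg.norm(g)
    return change, bound


rng = np.random.default_rng(0)
results = []
for eta in (0.01, 0.1, 1.0, 10.0):
    x = rng.normal(size=5)  # random starting point
    change, bound = step_change_and_bound(x, eta)
    results.append((eta, change, bound))
    print(f"eta={eta:<5} change={change:.4f}  bound={bound:.4f}")
```

For every learning rate the observed change stays at or below the bound. For small $\eta$ the two values are close, while for large $\eta$ the bound becomes loose, which reflects that it is a worst-case guarantee rather than a prediction of the actual change.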

Updated 2026-05-15

Tags

D2L

Dive into Deep Learning @ D2L