Multivariate Gradient Descent on a Two-Dimensional Quadratic

To demonstrate multivariate gradient descent in practice, consider the two-dimensional objective function $f(\mathbf{x}) = x_1^2 + 2x_2^2$ with input $\mathbf{x} = [x_1, x_2]^\top$. Its gradient is $\nabla f(\mathbf{x}) = [2x_1, 4x_2]^\top$. Starting from the initial point $[-5, -2]$ and applying the update rule $\mathbf{x} \leftarrow \mathbf{x} - \eta \nabla f(\mathbf{x})$ with a learning rate of $\eta = 0.1$ for $20$ iterations, the trajectory of $\mathbf{x}$ converges steadily toward the minimum at $[0, 0]$. After $20$ steps the parameters reach approximately $x_1 \approx -0.057646$ and $x_2 \approx -0.000073$, confirming well-behaved but relatively slow convergence. The contour plot of $f$ shows elliptical level curves (elongated along $x_1$) because the curvature in the $x_2$ direction is twice that in $x_1$, which causes $x_2$ to converge faster than $x_1$.
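The run described above can be reproduced with a few lines of plain Python. This is a minimal sketch, not the source's own implementation: the names `grad_f`, `gradient_descent`, and `trajectory` are chosen here for illustration, and the starting point, learning rate, and step count are taken from the text.

```python
def grad_f(x1, x2):
    """Gradient of f(x) = x1^2 + 2*x2^2, i.e. [2*x1, 4*x2]."""
    return 2 * x1, 4 * x2


def gradient_descent(x1, x2, eta=0.1, num_steps=20):
    """Apply x <- x - eta * grad f(x) and return all iterates, start included."""
    trajectory = [(x1, x2)]
    for _ in range(num_steps):
        g1, g2 = grad_f(x1, x2)
        # Update both coordinates simultaneously from the same gradient.
        x1, x2 = x1 - eta * g1, x2 - eta * g2
        trajectory.append((x1, x2))
    return trajectory


# Start at [-5, -2] with eta = 0.1 for 20 iterations, as in the text.
trajectory = gradient_descent(-5.0, -2.0)
x1_final, x2_final = trajectory[-1]
print(f"x1: {x1_final:.6f}, x2: {x2_final:.6f}")
```

For this particular quadratic, each step multiplies $x_1$ by $1 - 2\eta = 0.8$ and $x_2$ by $1 - 4\eta = 0.6$, so after $20$ steps $x_1 = -5 \cdot 0.8^{20}$ and $x_2 = -2 \cdot 0.6^{20}$. The smaller factor on $x_2$ is why that coordinate shrinks toward zero faster.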

Updated 2026-05-15

Tags

D2L

Dive into Deep Learning @ D2L