Formula
Optimal Step Size according to Taylor Series Approximation
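Denote the function as $f(x)$, where $g$ is the gradient and $H$ is the Hessian at $x^{(0)}$. We calculate the new point $x^{(0)} - \epsilon g$, where $\epsilon$ is the step size. Substituting it into the second-order Taylor series approximation of $f$ around $x^{(0)}$, we obtain $f(x^{(0)} - \epsilon g) \approx f(x^{(0)}) - \epsilon g^\top g + \frac{1}{2} \epsilon^2 g^\top H g$. The right-hand side is a quadratic in $\epsilon$, so setting its derivative with respect to $\epsilon$ to zero gives the minimizer. According to the above equation, the optimal step size when $g^\top H g$ is positive is $\epsilon^* = \frac{g^\top g}{g^\top H g}$.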
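The formula can be checked numerically with a small sketch in plain Python. Everything here is an illustrative assumption rather than part of the original card: the helper names (`dot`, `matvec`, `optimal_step_size`), the quadratic test function $f(x) = \frac{1}{2} x^\top A x - b^\top x$, and the values of `A`, `b`, and `x0`. A quadratic is chosen because its gradient is $Ax - b$ and its Hessian is $A$ everywhere, so the second-order Taylor approximation is exact and the computed step should minimize $f$ along the negative gradient direction.

```python
def dot(u, v):
    """Inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [dot(row, v) for row in M]


def optimal_step_size(g, H):
    """Return g^T g / (g^T H g), or None when the curvature g^T H g is not positive."""
    curvature = dot(g, matvec(H, g))
    if curvature <= 0:
        return None  # the formula only applies for positive curvature
    return dot(g, g) / curvature


# Hypothetical test problem: f(x) = 0.5 * x^T A x - b^T x
A = [[3.0, 1.0], [1.0, 2.0]]  # symmetric positive definite, also the Hessian of f
b = [1.0, 1.0]


def f(x):
    return 0.5 * dot(x, matvec(A, x)) - dot(b, x)


def grad(x):
    # Gradient of the quadratic: A x - b
    return [ax - bi for ax, bi in zip(matvec(A, x), b)]


def step(x, g, eps):
    """Move from x by eps along the negative gradient."""
    return [xi - eps * gi for xi, gi in zip(x, g)]


x0 = [2.0, -1.0]
g = grad(x0)
eps = optimal_step_size(g, A)
x1 = step(x0, g, eps)

print("step size:", eps)
print("f before:", f(x0), "f after:", f(x1))
```

Slightly smaller or larger steps along the same direction give a higher function value than `f(x1)`, which is what the formula predicts. When $g^\top H g$ is not positive, the sketch returns `None` instead of a step size, since the quadratic approximation then has no minimum along that direction.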
Updated 2026-05-15
Tags
Data Science
D2L
Dive into Deep Learning @ D2L
Related
On a straight line, the function's derivative...
Gradient Descent
A crash course of derivatives
Second Derivative
Hessian Matrix
Lipschitz Continuous
Differentiation Rules
Derivatives of Common Functions
Chain Rule for Single-Variable Functions
Jacobian Matrix
Partial Derivative
Gradient of a Scalar-Valued Function with Respect to a Vector
Optimal Step Size according to Taylor Series Approximation