Concept
Applicability of Second-Order Methods in Deep Learning
While second-order optimization algorithms such as Newton's method offer the theoretical advantage of using curvature to determine step sizes, they are generally impractical for deep neural networks. The primary limitation is the cost of the Hessian matrix $\mathbf{H}$. For a model with $d$ parameters, the Hessian requires storing $\mathcal{O}(d^2)$ entries, and computing it via backpropagation is very expensive, which makes the direct application of pure second-order methods infeasible for large-scale deep learning tasks.
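To get a feel for the quadratic storage cost, the following is a minimal sketch that estimates the memory a dense Hessian would need. Writing `num_params` for the number of model parameters, a dense Hessian has `num_params * num_params` entries. The helper names `hessian_entries` and `hessian_memory_gib`, the 4-byte (32-bit float) entry size, and the example parameter counts are illustrative assumptions, not part of the original text.

```python
def hessian_entries(num_params: int) -> int:
    """Number of entries in a dense Hessian for num_params parameters."""
    return num_params * num_params


def hessian_memory_gib(num_params: int, bytes_per_entry: int = 4) -> float:
    """Memory in GiB for a dense Hessian.

    Assumes every entry is stored explicitly (no symmetry or sparsity
    savings) and takes bytes_per_entry bytes (4 = 32-bit float, an assumption).
    """
    return hessian_entries(num_params) * bytes_per_entry / 2**30


# Illustrative parameter counts, chosen only to show the quadratic growth.
for d in (1_000, 1_000_000, 100_000_000):
    print(f"d = {d:>11,}: {hessian_entries(d):.1e} entries, "
          f"{hessian_memory_gib(d):,.1f} GiB")
```

Even before any compute cost is considered, the storage requirement grows a million-fold each time the parameter count grows a thousand-fold, while the gradient used by first-order methods needs only one entry per parameter.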
Updated 2026-05-15
Tags
D2L
Dive into Deep Learning @ D2L