Alternative to the Hessian (Krylov Methods)
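Sometimes our models need higher-order derivatives in order to learn. When second-order derivatives are needed, the natural object is the Hessian matrix. However, a model with n parameters has a Hessian with n^2 entries, and models often have millions or even billions of parameters, so the full Hessian is usually infeasible to compute or even to store. Krylov methods avoid this problem: they are iterative techniques that use only matrix-vector products, so they need the product of the Hessian with a vector rather than the Hessian itself.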
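For some function $f$ with Hessian $H$, and an arbitrary vector $v$, the Hessian-vector product can be computed without ever forming $H$:
$$Hv = \nabla_x \left[ \left( \nabla_x f(x) \right)^\top v \right]$$
Both gradients on the right-hand side can be obtained with ordinary automatic differentiation, provided $v$ is treated as a constant that does not depend on $x$.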
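As a rough illustration of working with the Hessian only through its product with a vector, the sketch below approximates that product by a central difference of the gradient along the vector direction. This is a finite-difference stand-in, not the exact double-gradient computation an automatic differentiation library would perform. The toy function, the helper name `hessian_vector_product`, and the step size `eps` are all assumptions made for the example.

```python
def grad_f(x):
    """Gradient of the hypothetical toy function f(x) = x0^2 * x1 + x1^3."""
    x0, x1 = x
    return [2.0 * x0 * x1, x0 ** 2 + 3.0 * x1 ** 2]


def hessian_vector_product(grad, x, v, eps=1e-5):
    """Approximate H(x) v using only gradient evaluations.

    Uses the central difference (grad(x + eps*v) - grad(x - eps*v)) / (2*eps),
    so the Hessian matrix itself is never built or stored.
    """
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    g_plus = grad(x_plus)
    g_minus = grad(x_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


x = [1.0, 2.0]   # point at which the Hessian is (implicitly) evaluated
v = [0.5, -1.0]  # arbitrary direction vector

hv = hessian_vector_product(grad_f, x, v)

# For comparison only: the analytic Hessian at x is
# [[2*x1, 2*x0], [2*x0, 6*x1]] = [[4, 2], [2, 12]].
hv_exact = [4.0 * 0.5 + 2.0 * -1.0, 2.0 * 0.5 + 12.0 * -1.0]  # [0.0, -11.0]
```

The cost is two gradient evaluations per product, regardless of how many parameters there are, which is what makes iterative methods built on such products practical for large models.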
Updated 2021-06-11
Tags
Data Science