
Conjugate Gradients

The method of conjugate gradients typically converges faster than the method of steepest descent, and it avoids computing the inverse Hessian required by Newton's method. Whereas steepest descent can repeatedly undo progress made along earlier search directions and must re-minimize along them, conjugate gradients chooses each new search direction to be conjugate to the previous line-search direction, so minimization along one direction is not spoiled by the next step.

At iteration $t$, the next search direction is

$$d_t = \nabla_\theta f(\theta_t) + \beta_t d_{t-1}$$

where $\beta_t$ is a coefficient that controls how much of the previous direction is retained. Two popular ways to calculate $\beta_t$ are:

Fletcher-Reeves:

$$\beta_t = \frac{\nabla_\theta f(\theta_t)^\top \nabla_\theta f(\theta_t)}{\nabla_\theta f(\theta_{t-1})^\top \nabla_\theta f(\theta_{t-1})}$$

Polak-Ribière:

$$\beta_t = \frac{(\nabla_\theta f(\theta_t) - \nabla_\theta f(\theta_{t-1}))^\top \nabla_\theta f(\theta_t)}{\nabla_\theta f(\theta_{t-1})^\top \nabla_\theta f(\theta_{t-1})}$$
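As a concrete illustration, here is a minimal sketch of nonlinear conjugate gradients in Python with NumPy. Everything beyond the formulas above is an assumption for the sake of the example: the names (`conjugate_gradients`, `f`, `grad`, `beta_rule`), the backtracking line search (a crude stand-in for the accurate line search the method really assumes), and the restart safeguard. Note the sign convention: a descent implementation uses the negative gradient, $d_t = -\nabla_\theta f(\theta_t) + \beta_t d_{t-1}$, so the first step reduces to plain steepest descent.

```python
import numpy as np


def conjugate_gradients(f, grad, theta, n_iters=50, beta_rule="polak-ribiere"):
    """Nonlinear conjugate gradients (illustrative sketch, not a library API)."""
    g = grad(theta)
    d = -g  # first direction: plain steepest descent
    for _ in range(n_iters):
        if g @ g < 1e-12:  # gradient numerically zero: converged
            break
        # Backtracking (Armijo) line search along d; a simple stand-in
        # for the more accurate line search the classical method assumes.
        alpha = 1.0
        for _ in range(30):
            if f(theta + alpha * d) <= f(theta) + 1e-4 * alpha * (g @ d):
                break
            alpha *= 0.5
        theta = theta + alpha * d

        g_new = grad(theta)
        if beta_rule == "fletcher-reeves":
            beta = (g_new @ g_new) / (g @ g)
        else:  # Polak-Ribiere
            beta = ((g_new - g) @ g_new) / (g @ g)
        d = -g_new + beta * d  # new direction (conjugate to the previous one
        #                        when line searches are exact on a quadratic)
        if g_new @ d >= 0:  # safeguard: restart if d is not a descent direction
            d = -g_new
        g = g_new
    return theta


# Example: minimize the quadratic f(theta) = 0.5 theta^T A theta - b^T theta,
# whose exact minimizer solves A theta = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
theta_hat = conjugate_gradients(
    lambda th: 0.5 * th @ A @ th - b @ th,
    lambda th: A @ th - b,
    np.zeros(2),
)
print(theta_hat, np.linalg.solve(A, b))  # the two should roughly agree
```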
