
Conjugate Gradients

The method of conjugate gradients typically converges faster than the method of steepest descent, and it avoids computing the inverse Hessian required by Newton's method. Whereas steepest descent can repeatedly undo progress made along earlier search directions and must re-minimize along them, conjugate gradients chooses each new search direction to be conjugate to the previous line-search direction, so minimization along one direction is not spoiled by the next step.

At iteration $t$, the next search direction is

$$d_t = \nabla_\theta f(\theta_t) + \beta_t d_{t-1}$$

where $\beta_t$ is a coefficient that controls how much of the previous direction is retained. Two popular ways to calculate $\beta_t$ are:

Fletcher-Reeves:

$$\beta_t = \frac{\nabla_\theta f(\theta_t)^\top \nabla_\theta f(\theta_t)}{\nabla_\theta f(\theta_{t-1})^\top \nabla_\theta f(\theta_{t-1})}$$

Polak-Ribière:

$$\beta_t = \frac{(\nabla_\theta f(\theta_t) - \nabla_\theta f(\theta_{t-1}))^\top \nabla_\theta f(\theta_t)}{\nabla_\theta f(\theta_{t-1})^\top \nabla_\theta f(\theta_{t-1})}$$
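As a concrete illustration, here is a minimal sketch of nonlinear conjugate gradients in Python with NumPy. Everything beyond the formulas above is an assumption for the sake of the example: the names (`conjugate_gradients`, `f`, `grad`, `beta_rule`), the backtracking line search (a crude stand-in for the accurate line search the method really assumes), and the restart safeguard. Note the sign convention: a descent implementation uses the negative gradient, $d_t = -\nabla_\theta f(\theta_t) + \beta_t d_{t-1}$, so the first step reduces to plain steepest descent.

```python
import numpy as np


def conjugate_gradients(f, grad, theta, n_iters=50, beta_rule="polak-ribiere"):
    """Nonlinear conjugate gradients (illustrative sketch, not a library API)."""
    g = grad(theta)
    d = -g  # first direction: plain steepest descent
    for _ in range(n_iters):
        if g @ g < 1e-12:  # gradient numerically zero: converged
            break
        # Backtracking (Armijo) line search along d; a simple stand-in
        # for the more accurate line search the classical method assumes.
        alpha = 1.0
        for _ in range(30):
            if f(theta + alpha * d) <= f(theta) + 1e-4 * alpha * (g @ d):
                break
            alpha *= 0.5
        theta = theta + alpha * d

        g_new = grad(theta)
        if beta_rule == "fletcher-reeves":
            beta = (g_new @ g_new) / (g @ g)
        else:  # Polak-Ribiere
            beta = ((g_new - g) @ g_new) / (g @ g)
        d = -g_new + beta * d  # new direction (conjugate to the previous one
        #                        when line searches are exact on a quadratic)
        if g_new @ d >= 0:  # safeguard: restart if d is not a descent direction
            d = -g_new
        g = g_new
    return theta


# Example: minimize the quadratic f(theta) = 0.5 theta^T A theta - b^T theta,
# whose exact minimizer solves A theta = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
theta_hat = conjugate_gradients(
    lambda th: 0.5 * th @ A @ th - b @ th,
    lambda th: A @ th - b,
    np.zeros(2),
)
print(theta_hat, np.linalg.solve(A, b))  # the two should roughly agree
```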
