Concept

Conjugate Gradient Method

The conjugate gradient method is an optimization algorithm that is typically faster than the method of steepest descent, and it avoids the calculation of the inverse Hessian matrix required by Newton's method. Instead of undoing direction search progress made previously and recalculating each step, the conjugate gradient method looks for a search direction that is conjugate to the previous line search direction. At iteration tt, the next search direction dtd_t is:

dt=θf(θt)+βtdt1d_t = \nabla _\theta f(\theta_t) + \beta _t d_{t-1}

where βt\beta _t is a coefficient that controls the direction. Two popular ways to calculate βt\beta _t are the Fletcher-Reeves formula:

βt=θf(θt)θf(θt)θf(θt1)θf(θt1)\beta _t = \frac{\nabla _\theta f(\theta _t)^\top \nabla _\theta f(\theta _t)}{\nabla _\theta f(\theta _{t-1})^\top \nabla _\theta f(\theta _{t-1})}

and the Polak-Ribière formula:

βt=(θf(θt)θf(θt1))θf(θt)θf(θt1)θf(θt1)\beta _t = \frac{(\nabla _\theta f(\theta _t) - \nabla_\theta f(\theta _{t-1}))^\top \nabla _\theta f(\theta _t)}{\nabla _\theta f(\theta _{t-1})^\top \nabla _\theta f(\theta _{t-1})}

0

1

Updated 2026-06-14

References


Tags

Data Science