1Cademy - Mathematical Mechanism of Vanishing and Exploding Gradients in Recurrent Neural Networks

Learn Before

Vanishing/exploding gradient

Concept

Mathematical Mechanism of Vanishing and Exploding Gradients in Recurrent Neural Networks

Vanishing and exploding gradients are common problems in recurrent neural networks. Consider a network where an input is multiplied by a weight matrix $\mathbf{W}$ for $t$ time steps. Let $\mathbf{W}^t$ have the eigendecomposition $\mathbf{V} \Lambda^t \mathbf{V}^{-1}$ , where $\Lambda$ is a diagonal matrix of eigenvalues. We can see that if an eigenvalue $\lambda > 1$ , the result will approach $\infty$ as $t$ gets large, leading to an exploding gradient. Conversely, if $\lambda < 1$ , the result will approach $0$ as $t$ gets large, resulting in a vanishing gradient.

Updated 2026-05-18

Contributors are:

Who are from:

University of Connecticut

🏆 4

Google

✔️ 1

References

Deep Learning

Learn Before

Related