Learn Before
Mathematical Mechanism of Vanishing and Exploding Gradients in Recurrent Neural Networks
Vanishing and exploding gradients are common problems in recurrent neural networks. Consider a network where an input is multiplied by a weight matrix for time steps. Let have the eigendecomposition , where is a diagonal matrix of eigenvalues. We can see that if an eigenvalue , the result will approach as gets large, leading to an exploding gradient. Conversely, if , the result will approach as gets large, resulting in a vanishing gradient.
0
2
Tags
Data Science
Related
Solutions for vanishing/exploding gradient
A Gentle Introduction to Exploding Gradients in Neural Networks
Zero Weight Initialization in Feed-Forward Networks
Impact of Exploding Gradients on Model Training
Vanishing Gradient of the Tanh Activation Function
Reparametrization to Mitigate Stalling Optimization
Mathematical Mechanism of Vanishing and Exploding Gradients in Recurrent Neural Networks