Learn Before
Reparametrization to Mitigate Stalling Optimization
When optimization in deep neural networks stalls—frequently a consequence of vanishing gradients—a common mitigation strategy is reparametrization. This approach involves altering the mathematical formulation of the problem to create a more favorable loss landscape, thereby allowing optimization algorithms to resume making progress.
0
1
Tags
D2L
Dive into Deep Learning @ D2L
Related
Solutions for vanishing/exploding gradient
A Gentle Introduction to Exploding Gradients in Neural Networks
Zero Weight Initialization in Feed-Forward Networks
Impact of Exploding Gradients on Model Training
Vanishing Gradient of the Tanh Activation Function
Reparametrization to Mitigate Stalling Optimization
Mathematical Mechanism of Vanishing and Exploding Gradients in Recurrent Neural Networks