Learn Before
Concept

Impact of Exploding Gradients on Model Training

In neural network optimization, exploding gradients occur when the gradient norm g\|\mathbf{g}\| becomes excessively large. Because the parameter update is proportional to the gradient, a single step with an exploded gradient can cause massive changes to the model, potentially undoing the progress of thousands of previous iterations. This results in training that either diverges completely, failing to minimize the objective function, or exhibits extreme instability with massive spikes in the loss function.

0

1

Updated 2026-05-14

Tags

Data Science

D2L

Dive into Deep Learning @ D2L