Learn Before
Concept
Dropout Regularization for Symmetry Breaking
When a neural network's hidden-layer parameters are initialized to the same constant, every hidden unit computes the same function and receives the same gradient, so standard gradient-based algorithms, such as minibatch stochastic gradient descent, update the parameters identically and can never break this symmetry on their own. Dropout regularization, however, can break the symmetry: because each unit is zeroed out independently at random, different units receive different gradients on different minibatches, their parameters diverge, and the network can eventually realize its full expressive power.
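A minimal NumPy sketch of this effect (the tiny two-layer ReLU network, the constant 0.1 initialization, and the squared loss are illustrative assumptions, not from the source): without dropout, the constant-initialized hidden rows receive identical gradients and stay identical; with inverted dropout, random per-unit masks give the rows different updates and the symmetry breaks.

```python
import numpy as np

def sgd_step(W1, w2, x, y, lr, mask):
    """One SGD step on a 2-layer ReLU net with squared loss and a dropout mask."""
    z = W1 @ x                       # hidden pre-activations
    h = np.maximum(z, 0) * mask      # ReLU, then (inverted) dropout scaling
    err = w2 @ h - y                 # residual of the squared loss
    g_w2 = err * h                   # gradient w.r.t. output weights
    g_z = err * w2 * mask * (z > 0)  # backprop through dropout and ReLU
    return W1 - lr * np.outer(g_z, x), w2 - lr * g_w2

H, x, y, lr = 4, np.array([1.0, 2.0]), 3.0, 0.1
const_W1, const_w2 = np.full((H, 2), 0.1), np.full(H, 0.1)  # constant init

# Without dropout: identical rows receive identical gradients forever.
W1_plain, w2_plain = const_W1.copy(), const_w2.copy()
for _ in range(20):
    W1_plain, w2_plain = sgd_step(W1_plain, w2_plain, x, y, lr, np.ones(H))

# With dropout (p = 0.5): random masks give each unit a different update.
rng = np.random.default_rng(0)
W1_drop, w2_drop = const_W1.copy(), const_w2.copy()
for _ in range(20):
    mask = (rng.random(H) < 0.5) / 0.5   # inverted dropout: keep and rescale
    W1_drop, w2_drop = sgd_step(W1_drop, w2_drop, x, y, lr, mask)

print(np.allclose(W1_plain[0], W1_plain[1]))  # True: symmetry intact
print(np.allclose(W1_drop[0], W1_drop[1]))    # False: symmetry broken
```

The key line is the mask: as soon as one unit is dropped while another is kept on some step, their rows receive different gradients and never re-merge.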
Updated 2026-05-06
Tags
D2L
Dive into Deep Learning @ D2L