Learn Before
Block-Specific Parameter Initialization
Neural network parameters do not need to be initialized uniformly across an entire model. Deep learning frameworks allow practitioners to apply distinct initialization methods to specific architectural blocks or layers. For instance, one layer might use the Xavier initializer to maintain activation variance, while another layer in the same network could have its parameters initialized to a specific constant value.
0
1
Tags
D2L
Dive into Deep Learning @ D2L
Related
Example of Weight Initialization
Vanishing/exploding gradient
Symmetry Breaking in Deep Learning
Transfer Learning in Deep Learning
Multi-task Learning in Deep Learning
Variance of Layer Output in Forward Propagation
Default Random Initialization
Xavier Initialization
Built-in Gaussian Parameter Initialization
Constant Parameter Initialization
Block-Specific Parameter Initialization
Forced Parameter Reinitialization
Custom Parameter Initialization
Direct Parameter Assignment
Lazy Parameter Initialization
How to Initialize Weights to Prevent Vanishing/Exploding Gradients