Xavier Initialization
Xavier initialization, proposed by Glorot and Bengio (2010) and named after Xavier Glorot, is a standard technique for mitigating vanishing and exploding gradients by carefully setting the initial weights of a neural network layer. To balance the variance of activations in forward propagation and of gradients in backward propagation, it samples weights from a Gaussian distribution with a mean of 0 and a variance of 2 / (n_in + n_out), where n_in and n_out denote the number of inputs and outputs of the layer, respectively. Although its derivation assumes linear activations, an assumption that is usually violated in practice, the method has proven highly effective.
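As a minimal sketch of this rule (the function name xavier_normal and the layer sizes are illustrative, not from the source), the sampling can be written in a few lines of NumPy:

```python
import numpy as np

def xavier_normal(n_in, n_out, rng=None):
    # Standard deviation follows Var[w] = 2 / (n_in + n_out), so forward
    # activations and backward gradients keep comparable variance.
    rng = np.random.default_rng() if rng is None else rng
    std = np.sqrt(2.0 / (n_in + n_out))
    return rng.normal(loc=0.0, scale=std, size=(n_in, n_out))

# Example: a fully connected layer with 256 inputs and 128 outputs.
W = xavier_normal(256, 128, rng=np.random.default_rng(0))
print(W.var(), 2 / (256 + 128))  # empirical variance is close to the target
```

Deep learning frameworks provide equivalent helpers, e.g. torch.nn.init.xavier_normal_ and torch.nn.init.xavier_uniform_ in PyTorch.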
Tags
Data Science
D2L
Dive into Deep Learning @ D2L
Related
Example of Weight Initialization
Vanishing/exploding gradient
Symmetry Breaking in Deep Learning
How to Initialize Weights to Prevent Vanishing/Exploding Gradients
Transfer Learning in Deep Learning
Multi-task Learning in Deep Learning
Variance of Layer Output in Forward Propagation
Default Random Initialization