GRU Parameters Initialization
In a Gated Recurrent Unit (GRU) network, the learnable parameters are the weight matrices and bias vectors of the update gate, the reset gate, and the candidate hidden state. Their shapes are determined by the input size and by the number of hidden units, which is a hyperparameter. A standard initialization strategy draws every weight from a Gaussian distribution with a specified standard deviation and sets every bias to exactly zero.
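The initialization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name `init_gru_params`, the parameter names (`W_xz`, `W_hz`, `b_z`, and so on), the default standard deviation of 0.01, and the fixed seed are all assumptions made for the example.

```python
import numpy as np


def init_gru_params(num_inputs, num_hiddens, sigma=0.01, seed=0):
    """Return a dict of GRU parameters: Gaussian weights, zero biases.

    sigma and seed are illustrative defaults, not values fixed by the text.
    """
    rng = np.random.default_rng(seed)

    def gate_params():
        # One gate needs an input-to-hidden matrix, a hidden-to-hidden
        # matrix, and a bias vector with one entry per hidden unit.
        W_x = rng.normal(0.0, sigma, size=(num_inputs, num_hiddens))
        W_h = rng.normal(0.0, sigma, size=(num_hiddens, num_hiddens))
        b = np.zeros(num_hiddens)
        return W_x, W_h, b

    params = {}
    # z: update gate, r: reset gate, h: candidate hidden state
    for name in ("z", "r", "h"):
        W_x, W_h, b = gate_params()
        params[f"W_x{name}"] = W_x
        params[f"W_h{name}"] = W_h
        params[f"b_{name}"] = b
    return params


params = init_gru_params(num_inputs=28, num_hiddens=32)
for name, value in params.items():
    print(name, value.shape)
```

With `num_inputs = d` and `num_hiddens = h`, each of the three groups contributes `d*h + h*h + h` values, so the total parameter count under this layout is `3 * (d*h + h*h + h)`. An output layer, if the model has one, would add its own parameters on top of that.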
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L