Seq2SeqEncoder Output Shapes
Given a minibatch of sequence inputs with batch size batch_size and num_steps time steps, a two-layer GRU encoder with num_hiddens hidden units per layer produces two tensors. The enc_outputs tensor has the shape (num_steps, batch_size, num_hiddens) and holds the top-layer hidden states at every time step. The enc_state tensor has the shape (num_layers, batch_size, num_hiddens), with num_layers = 2 here, and holds the hidden states of all layers at the final time step only. Because a GRU keeps a single hidden state vector (unlike an LSTM, which also maintains a separate memory cell), the state is one tensor with exactly three dimensions.
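The shapes described above can be checked with a minimal PyTorch sketch. The class below is an illustrative encoder built from nn.Embedding and nn.GRU, not necessarily the book's exact implementation, and the concrete sizes (vocab_size, embed_size, batch_size, num_steps, num_hiddens) are assumed values chosen only for the demonstration.

```python
import torch
from torch import nn


class Seq2SeqEncoder(nn.Module):
    """Illustrative GRU encoder (a sketch, not the book's exact class)."""

    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # batch_first defaults to False, so the GRU expects input of shape
        # (num_steps, batch_size, embed_size).
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers)

    def forward(self, X):
        # X holds token indices with shape (batch_size, num_steps).
        # Transpose so that time is the leading axis before embedding.
        embs = self.embedding(X.t())  # (num_steps, batch_size, embed_size)
        enc_outputs, enc_state = self.rnn(embs)
        return enc_outputs, enc_state


# Assumed sizes, chosen only for illustration.
vocab_size, embed_size, num_hiddens, num_layers = 10, 8, 16, 2
batch_size, num_steps = 4, 9

encoder = Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers)
X = torch.zeros((batch_size, num_steps), dtype=torch.long)
enc_outputs, enc_state = encoder(X)

print(enc_outputs.shape)  # torch.Size([9, 4, 16])
print(enc_state.shape)    # torch.Size([2, 4, 16])

# The last time step of enc_outputs is the top layer's final hidden state,
# so it matches the last layer slice of enc_state.
print(torch.allclose(enc_outputs[-1], enc_state[-1]))  # True
```

With an nn.LSTM in place of nn.GRU, the second return value would instead be a pair (hidden state, memory cell), each with the same three-dimensional shape.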
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L