Example

Seq2SeqEncoder Output Shapes

Given a minibatch of sequence inputs with batch size 4 and 9 time steps, a two-layer GRU encoder with 16 hidden units produces two tensors. The enc_outputs tensor has shape (num_steps, batch_size, num_hiddens) = (9, 4, 16) and holds the top-layer hidden states at every time step. The enc_state tensor has shape (num_layers, batch_size, num_hiddens) = (2, 4, 16) and holds the hidden states of every layer at the final time step only. Because a GRU keeps a single hidden state vector (unlike an LSTM, which also maintains a separate memory cell), the state is one three-dimensional tensor rather than a pair of tensors.
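The shapes above can be checked with a minimal sketch of such an encoder in PyTorch. This is an illustration under stated assumptions, not necessarily the exact D2L implementation: the vocabulary size (10) and embedding size (8) are made-up values that the text does not specify, and the class body is a simplified stand-in built on `nn.Embedding` and `nn.GRU`.

```python
import torch
from torch import nn


class Seq2SeqEncoder(nn.Module):
    """Simplified GRU encoder sketch (embedding layer + multilayer GRU)."""

    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # nn.GRU defaults to time-major input: (num_steps, batch_size, embed_size)
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers)

    def forward(self, X):
        # X: (batch_size, num_steps) token indices; transpose to time-major
        embs = self.embedding(X.t())
        # outputs: top-layer hidden states at every time step
        # state: hidden states of all layers at the final time step
        outputs, state = self.rnn(embs)
        return outputs, state


# Values from the text
batch_size, num_steps, num_hiddens, num_layers = 4, 9, 16, 2
# Assumed values (not given in the text)
vocab_size, embed_size = 10, 8

encoder = Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers)
X = torch.zeros((batch_size, num_steps), dtype=torch.long)
enc_outputs, enc_state = encoder(X)

print(enc_outputs.shape)  # torch.Size([9, 4, 16])
print(enc_state.shape)    # torch.Size([2, 4, 16])

# For contrast, an LSTM returns its state as a pair: hidden state and memory cell
lstm = nn.LSTM(embed_size, num_hiddens, num_layers)
_, (h, c) = lstm(torch.zeros(num_steps, batch_size, embed_size))
print(h.shape, c.shape)   # torch.Size([2, 4, 16]) torch.Size([2, 4, 16])
```

Note that the last time step of enc_outputs coincides with the top-layer slice of enc_state, i.e. `enc_outputs[-1]` equals `enc_state[-1]`, since both are the top layer's hidden state at the final step.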

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L