Code

Seq2SeqEncoder Implementation

The Seq2SeqEncoder class implements the RNN-based encoder for sequence-to-sequence learning by extending a base Encoder interface. Its architecture has two components: an embedding layer that maps each input token index to a dense feature vector, and a multilayer GRU that processes the resulting sequence of embeddings. The embedding layer's weight matrix has shape (vocab_size, embed_size); row i stores the feature vector for the token with index i. During the forward pass, the input tensor of shape (batch_size, num_steps) is first transposed and then embedded, producing a tensor of shape (num_steps, batch_size, embed_size). The GRU processes this sequence and returns two values: outputs, of shape (num_steps, batch_size, num_hiddens), which holds the final-layer hidden states at every time step, and state, of shape (num_layers, batch_size, num_hiddens), which holds the hidden states of all layers at the final time step. All weights are initialized using Xavier initialization.
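The description above can be sketched in PyTorch roughly as follows. This is a minimal illustration, not necessarily the reference implementation. It assumes torch.nn.Module stands in for the base Encoder interface and torch.nn.GRU for the multilayer GRU. The helper name init_seq2seq and the hyperparameter values in the usage lines are illustrative. Which parameters receive Xavier initialization is also an assumption: the sketch covers the GRU weight matrices (and any Linear layers) and leaves biases and the embedding table at PyTorch defaults.

```python
import torch
from torch import nn


def init_seq2seq(module):
    """Xavier-uniform init for Linear and GRU weight matrices (assumed scope)."""
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
    if isinstance(module, nn.GRU):
        for name, param in module.named_parameters():
            if "weight" in name:  # skip the 1-D bias vectors
                nn.init.xavier_uniform_(param)


class Seq2SeqEncoder(nn.Module):
    """RNN encoder sketch: embedding layer followed by a multilayer GRU."""

    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers,
                 dropout=0.0):
        super().__init__()
        # Weight matrix of shape (vocab_size, embed_size); row i is token i.
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # batch_first defaults to False, so the GRU expects time-major input.
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers, dropout=dropout)
        self.apply(init_seq2seq)

    def forward(self, X):
        # X: (batch_size, num_steps) -> transpose -> (num_steps, batch_size)
        # embs: (num_steps, batch_size, embed_size)
        embs = self.embedding(X.t().long())
        # outputs: (num_steps, batch_size, num_hiddens)
        # state:   (num_layers, batch_size, num_hiddens)
        outputs, state = self.rnn(embs)
        return outputs, state


# Usage with small, arbitrary sizes.
vocab_size, embed_size, num_hiddens, num_layers = 10, 8, 16, 2
batch_size, num_steps = 4, 9
encoder = Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers)
X = torch.zeros((batch_size, num_steps), dtype=torch.long)
outputs, state = encoder(X)
print(outputs.shape)  # torch.Size([9, 4, 16])
print(state.shape)    # torch.Size([2, 4, 16])
```

Because outputs holds the final-layer hidden state at every step, its last time step, outputs[-1], matches the last layer of state, state[-1].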


Updated 2026-05-14


Tags

D2L

Dive into Deep Learning @ D2L