Seq2SeqEncoder Implementation
The Seq2SeqEncoder class implements the RNN-based encoder for sequence-to-sequence learning by extending a base Encoder interface. It has two components: an embedding layer that maps each input token index to a dense feature vector, and a multilayer GRU that processes the resulting sequence of embeddings. The embedding layer's weight matrix has shape (vocab_size, embed_size), and row i stores the feature vector for the token with index i. During the forward pass, the input tensor of shape (batch_size, num_steps) is first transposed so that time is the leading axis and then embedded, producing a tensor of shape (num_steps, batch_size, embed_size). The GRU processes this sequence and returns two values: outputs, of shape (num_steps, batch_size, num_hiddens), which holds the final-layer hidden state at every time step, and state, of shape (num_layers, batch_size, num_hiddens), which holds the hidden state of every layer at the final time step. All weights are initialized using Xavier initialization.
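The description above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the framework choice, the stand-in Encoder base class, the init_xavier helper, the dropout parameter, and the toy sizes are all assumptions made for the example. In particular, "all weights" is read here as the embedding and GRU weight matrices; biases are left at PyTorch defaults because Xavier initialization is not defined for 1-D tensors.

```python
import torch
from torch import nn


class Encoder(nn.Module):
    """Hypothetical stand-in for the base Encoder interface."""

    def forward(self, X, *args):
        raise NotImplementedError


def init_xavier(module):
    """Assumed helper: Xavier-initialize embedding and GRU weight matrices."""
    if isinstance(module, nn.Embedding):
        nn.init.xavier_uniform_(module.weight)
    if isinstance(module, nn.GRU):
        for name, param in module.named_parameters():
            if "weight" in name:  # skip 1-D bias vectors
                nn.init.xavier_uniform_(param)


class Seq2SeqEncoder(Encoder):
    """Embedding layer followed by a multilayer GRU."""

    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers,
                 dropout=0.0):
        super().__init__()
        # Weight matrix of shape (vocab_size, embed_size); row i is token i.
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # Default batch_first=False, so the GRU expects time as the first axis.
        self.rnn = nn.GRU(embed_size, num_hiddens, num_layers, dropout=dropout)
        self.apply(init_xavier)

    def forward(self, X, *args):
        # X: (batch_size, num_steps) -> transpose -> (num_steps, batch_size)
        embs = self.embedding(X.t().long())
        # embs: (num_steps, batch_size, embed_size)
        outputs, state = self.rnn(embs)
        # outputs: (num_steps, batch_size, num_hiddens)
        # state:   (num_layers, batch_size, num_hiddens)
        return outputs, state


# Toy sizes chosen only for illustration.
vocab_size, embed_size, num_hiddens, num_layers = 10, 8, 16, 2
batch_size, num_steps = 4, 9

encoder = Seq2SeqEncoder(vocab_size, embed_size, num_hiddens, num_layers)
X = torch.zeros((batch_size, num_steps), dtype=torch.long)
outputs, state = encoder(X)

print(encoder.embedding.weight.shape)  # torch.Size([10, 8])
print(outputs.shape)                   # torch.Size([9, 4, 16])
print(state.shape)                     # torch.Size([2, 4, 16])
```

Because the GRU in this sketch is unidirectional, the last time step of outputs coincides with the top layer of state, that is, outputs[-1] equals state[-1]. This is a quick way to check that the two return values are being read correctly.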
Tags
D2L
Dive into Deep Learning @ D2L