Code

Seq2SeqAttentionDecoder Implementation

The Seq2SeqAttentionDecoder class implements a concrete RNN decoder that integrates Bahdanau-style additive attention into the sequence-to-sequence framework. Its architecture consists of four components: an AdditiveAttention module, an embedding layer that maps target token indices to dense vectors, a multilayer GRU whose input size equals embed_size + num_hiddens (to accommodate the concatenated context and embedding), and a fully connected output layer that projects hidden states to the target vocabulary size. All parameters are initialized using Xavier initialization.

0

1

Updated 2026-05-14

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L