Seq2SeqAttentionDecoder Implementation
The Seq2SeqAttentionDecoder class implements a concrete RNN decoder that integrates Bahdanau-style additive attention into the sequence-to-sequence framework. Its architecture consists of four components: an AdditiveAttention module; an embedding layer that maps target token indices to dense vectors; a multilayer GRU whose input size equals embed_size + num_hiddens, because each step's input is the attention context vector concatenated with the token embedding; and a fully connected output layer that projects hidden states to the target vocabulary size. Parameters are initialized with Xavier initialization.
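The four components described above can be sketched in PyTorch as follows. This is a minimal illustration, not the reference implementation: the framework choice, the constructor and forward signatures, and the init_weights helper are assumptions made for the example, and the attention module is simplified (it applies no padding mask over the source sequence). Here Xavier initialization is applied only to the weight matrices of the linear and GRU layers.

```python
import torch
from torch import nn


def init_weights(module):
    # Hypothetical helper: Xavier-initialize Linear and GRU weight matrices.
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
    elif isinstance(module, nn.GRU):
        for name, param in module.named_parameters():
            if "weight" in name:
                nn.init.xavier_uniform_(param)


class AdditiveAttention(nn.Module):
    """Simplified additive attention (assumption: no padding mask)."""

    def __init__(self, num_hiddens, dropout=0.0):
        super().__init__()
        self.W_q = nn.Linear(num_hiddens, num_hiddens, bias=False)
        self.W_k = nn.Linear(num_hiddens, num_hiddens, bias=False)
        self.w_v = nn.Linear(num_hiddens, 1, bias=False)
        self.dropout = nn.Dropout(dropout)

    def forward(self, queries, keys, values):
        # queries: (batch, 1, h); keys, values: (batch, src_steps, h)
        features = torch.tanh(
            self.W_q(queries).unsqueeze(2) + self.W_k(keys).unsqueeze(1)
        )  # (batch, 1, src_steps, h)
        scores = self.w_v(features).squeeze(-1)  # (batch, 1, src_steps)
        weights = torch.softmax(scores, dim=-1)
        return torch.bmm(self.dropout(weights), values)  # (batch, 1, h)


class Seq2SeqAttentionDecoder(nn.Module):
    def __init__(self, vocab_size, embed_size, num_hiddens, num_layers,
                 dropout=0.0):
        super().__init__()
        self.attention = AdditiveAttention(num_hiddens, dropout)
        self.embedding = nn.Embedding(vocab_size, embed_size)
        # Input size is embed_size + num_hiddens: context and embedding
        # are concatenated at every decoding step.
        self.rnn = nn.GRU(embed_size + num_hiddens, num_hiddens, num_layers,
                          dropout=dropout)
        self.dense = nn.Linear(num_hiddens, vocab_size)
        self.apply(init_weights)

    def forward(self, X, enc_outputs, hidden_state):
        # X: (batch, tgt_steps) token indices
        # enc_outputs: (batch, src_steps, h)
        # hidden_state: (num_layers, batch, h)
        X = self.embedding(X).permute(1, 0, 2)  # (tgt_steps, batch, embed)
        outputs = []
        for x in X:
            # Query is the top-layer hidden state from the previous step.
            query = hidden_state[-1].unsqueeze(1)  # (batch, 1, h)
            context = self.attention(query, enc_outputs, enc_outputs)
            x = torch.cat((context, x.unsqueeze(1)), dim=-1)
            out, hidden_state = self.rnn(x.permute(1, 0, 2), hidden_state)
            outputs.append(out)
        outputs = self.dense(torch.cat(outputs, dim=0))
        return outputs.permute(1, 0, 2), hidden_state


# Usage with illustrative sizes: batch of 4, 7 source steps, 5 target steps.
decoder = Seq2SeqAttentionDecoder(vocab_size=10, embed_size=8,
                                  num_hiddens=16, num_layers=2)
decoder.eval()
enc_outputs = torch.zeros(4, 7, 16)
hidden_state = torch.zeros(2, 4, 16)
X = torch.zeros(4, 5, dtype=torch.long)
logits, state = decoder(X, enc_outputs, hidden_state)
print(logits.shape, state.shape)
```

The loop recomputes the context vector at every target step because the query (the decoder's latest top-layer hidden state) changes after each GRU update.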
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L