Concept
Predicted Token Feedback in Decoder Training
As an alternative to teacher forcing, a sequence-to-sequence decoder can be trained by feeding it its own predicted token from the previous time step as the input at the current step, rather than the ground-truth token. This brings training closer to inference, where the decoder generates the sequence autoregressively and ground-truth tokens are unavailable.
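The difference between the two training schemes comes down to which token is passed forward at each step. The following is a minimal sketch in plain Python, not an implementation from the source: `toy_step`, `run_decoder`, the vocabulary size, and the token ids are all hypothetical, and the "model" is a fixed rule (it always predicts the previous token plus one) so that the inputs can be traced by hand.

```python
from typing import Callable, List, Tuple

VOCAB_SIZE = 5  # assumed toy vocabulary size

def toy_step(prev_token: int, state: int) -> Tuple[List[float], int]:
    """Hypothetical decoder step: scores peak at (prev_token + 1) % VOCAB_SIZE."""
    scores = [0.0] * VOCAB_SIZE
    scores[(prev_token + 1) % VOCAB_SIZE] = 1.0
    return scores, state + 1  # state is just a step counter here

def run_decoder(
    step: Callable[[int, int], Tuple[List[float], int]],
    targets: List[int],
    bos: int,
    feed_predictions: bool,
) -> Tuple[List[int], List[int]]:
    """Return the decoder inputs and predictions for one training sequence."""
    inputs, predictions = [], []
    prev, state = bos, 0
    for target in targets:
        inputs.append(prev)
        scores, state = step(prev, state)
        # Greedy choice: index of the highest score.
        pred = max(range(len(scores)), key=scores.__getitem__)
        predictions.append(pred)
        # Teacher forcing passes the ground-truth token forward;
        # predicted token feedback passes the model's own output forward.
        prev = pred if feed_predictions else target
    return inputs, predictions

targets = [3, 1, 4]  # assumed ground-truth token ids
tf_inputs, tf_preds = run_decoder(toy_step, targets, bos=0, feed_predictions=False)
pf_inputs, pf_preds = run_decoder(toy_step, targets, bos=0, feed_predictions=True)
print(tf_inputs)  # [0, 3, 1]
print(pf_inputs)  # [0, 1, 2]
```

In both modes the predictions would still be scored against `targets`; only the decoder inputs change. With predicted token feedback, an early mistake propagates into every later input, which is the same situation the decoder faces at inference time.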
Updated 2026-05-14
Tags
D2L
Dive into Deep Learning @ D2L