Predicted Token Feedback in Decoder Training

Teacher forcing feeds the ground-truth token from the previous time step into the decoder at every training step. As an alternative, a sequence-to-sequence decoder can be trained by feeding its own predicted token from the previous time step as the current input. This brings training closer to inference, where the model generates sequences autoregressively and has no ground-truth tokens to condition on.
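The difference between teacher forcing and predicted-token feedback comes down to which token is passed to the decoder at each step. The sketch below illustrates only that choice. It is not taken from the source: `decode_for_training`, `toy_predict`, and `TOY_TABLE` are hypothetical names, and the lookup table stands in for a real decoder step, which would normally return a distribution over the vocabulary.

```python
from typing import Callable, List, Tuple


def decode_for_training(
    targets: List[int],
    predict_next: Callable[[int], int],
    bos: int,
    feed_predictions: bool,
) -> Tuple[List[int], List[int]]:
    """Return (decoder inputs, decoder predictions) for one target sequence.

    feed_predictions=False: teacher forcing, the next input is the
    ground-truth token.
    feed_predictions=True: the next input is the model's own prediction.
    """
    inputs, predictions = [], []
    prev = bos  # the first input is always the beginning-of-sequence token
    for target in targets:
        inputs.append(prev)
        pred = predict_next(prev)
        predictions.append(pred)
        # The only difference between the two schemes is this line.
        prev = pred if feed_predictions else target
    return inputs, predictions


# Hypothetical stand-in for a decoder step: a fixed next-token table.
TOY_TABLE = {0: 5, 5: 7, 6: 9, 7: 8}


def toy_predict(prev_token: int) -> int:
    return TOY_TABLE.get(prev_token, 0)


targets = [5, 6, 9]
tf_inputs, tf_preds = decode_for_training(
    targets, toy_predict, bos=0, feed_predictions=False
)
fb_inputs, fb_preds = decode_for_training(
    targets, toy_predict, bos=0, feed_predictions=True
)

print("teacher forcing inputs:   ", tf_inputs)  # [0, 5, 6]
print("predicted-feedback inputs:", fb_inputs)  # [0, 5, 7]
```

In this toy run the model's second prediction (7) differs from the target (6). Under teacher forcing the third input is still the target 6, while under predicted-token feedback the wrong token 7 is fed back in, which is the same situation the decoder faces at inference time. In both schemes the loss would still compare the predictions against `targets`.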

Updated 2026-05-14

Tags

D2L

Dive into Deep Learning @ D2L