Learn Before
Formula

Most Likely Sequence in Sequence-to-Sequence Models

In sequence generation tasks, the ultimate objective is typically to find the single most likely output sequence, which can differ from a sequence formed by simply selecting the most likely token at each individual step. If a sequence-to-sequence decoder accurately reflects the underlying generative process, the most likely translation is the complete sequence that maximizes the product of conditional probabilities over all time steps: t=1TP(yty1,,yt1,c)\prod_{t'=1}^{T'} P(y_{t'} \mid y_1, \ldots, y_{t'-1}, \mathbf{c}) where c\mathbf{c} is the context vector. This expression represents the global optimal sequence based on the model's learned probability distribution.

0

1

Updated 2026-05-14

Contributors are:

Who are from:

Tags

D2L

Dive into Deep Learning @ D2L