Step-by-Step Example of Auto-Regressive Sequence Generation
An auto-regressive language model generates text one token at a time: each new token is predicted from the sequence of tokens that precede it, and the probability of the whole sequence is the product of these conditional probabilities. The following table illustrates this process for generating three tokens $x_2$, $x_3$, and $x_4$ given the prefix $\langle s \rangle\ a$:
| Context | Predicted Token | Decision Rule | Cumulative Sequence Probability |
|---|---|---|---|
| $\langle s \rangle\ a$ | $x_2 = b$ | $\argmax_{x_2 \in V} \Pr(x_2 \mid \langle s \rangle\ a)$ | $\Pr(b \mid \langle s \rangle\ a)$ |
| $\langle s \rangle\ a\ b$ | $x_3 = c$ | $\argmax_{x_3 \in V} \Pr(x_3 \mid \langle s \rangle\ a\ b)$ | $\Pr(b \mid \langle s \rangle\ a) \cdot \Pr(c \mid \langle s \rangle\ a\ b)$ |
| $\langle s \rangle\ a\ b\ c$ | $x_4$ | $\argmax_{x_4 \in V} \Pr(x_4 \mid \langle s \rangle\ a\ b\ c)$ | $\Pr(b \mid \langle s \rangle\ a) \cdot \Pr(c \mid \langle s \rangle\ a\ b) \cdot \Pr(x_4 \mid \langle s \rangle\ a\ b\ c)$ |
At each step, the model selects the token from the vocabulary $V$ that maximizes the conditional probability (a greedy decision rule). The predicted token is then appended to the end of the context, and the updated context is used for the next step.
Tags
Ch.2 Generative Models - Foundations of Large Language Models