Learn Before
Iterative Application of Argmax for Next Token Prediction
The argmax function is applied iteratively to select the most probable next token at each step of sequence generation. For a sequence beginning with the prefix x_1, the model first predicts the token x_2 by maximizing the conditional probability given x_1. It then uses this new context to predict x_3, and so on. This step-by-step process is illustrated by the following sequence of operations:
- Predict the second token: x_2 = argmax_{x in Vocabulary} P(x | x_1)
- Predict the third token: x_3 = argmax_{x in Vocabulary} P(x | x_1, x_2)
- Predict the fourth token: x_4 = argmax_{x in Vocabulary} P(x | x_1, x_2, x_3)
This iterative selection, where each new token is chosen by maximizing its conditional probability based on the preceding context, is a core mechanism of greedy decoding in autoregressive models.
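The loop described above can be sketched in a few lines of Python. The toy conditional distributions in `next_token_probs` are assumptions for illustration only; in a real model they would come from a softmax over the vocabulary.

```python
# Greedy decoding sketch: iteratively apply argmax to extend a sequence.
# The toy conditional distributions below are illustrative assumptions,
# standing in for a real model's softmax output.

def next_token_probs(context):
    """Return a toy conditional distribution P(token | context)."""
    table = {
        ("The",): {"cat": 0.6, "dog": 0.4},
        ("The", "cat"): {"sat": 0.7, "ran": 0.3},
        ("The", "cat", "sat"): {"down": 0.8, "up": 0.2},
    }
    return table[tuple(context)]

def greedy_decode(prefix, steps):
    context = list(prefix)
    for _ in range(steps):
        probs = next_token_probs(context)
        # argmax: pick the token with the highest conditional probability
        best = max(probs, key=probs.get)
        context.append(best)
    return context

print(greedy_decode(["The"], 3))  # ['The', 'cat', 'sat', 'down']
```

Each iteration conditions on everything generated so far, which is exactly the greedy, left-to-right behavior the bullets above describe.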
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A language model has processed the input sequence 'The sun is shining and the sky is' and must now predict the next word. It computes a probability for several words in its vocabulary. Given the formula
next_word = argmax_{word in Vocabulary} P(word | 'The sun is shining and the sky is')
and the following probability outputs, which word will the model select?
- P('blue' | context) = 0.85
- P('green' | context) = 0.05
- P('running' | context) = 0.02
- P('0.85' | context) = 0.08
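The selection above reduces to a single argmax over the listed probabilities. A minimal sketch (the dictionary simply restates the values given in the question):

```python
# Restating the probabilities above; argmax ranges over tokens, not values.
# Note '0.85' is itself a vocabulary token here, with probability 0.08.
probs = {"blue": 0.85, "green": 0.05, "running": 0.02, "0.85": 0.08}
next_word = max(probs, key=probs.get)  # token with the highest probability
print(next_word)  # blue
```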
Iterative Application of Argmax for Next Token Prediction
A language model is predicting the next token for the sequence 'The weather is'. It calculates that the probability for the token 'sunny' is 0.78, which is the highest probability for any token in its vocabulary. The selection process is defined by the formula:
predicted_token = argmax_{token in Vocabulary} P(token | 'The weather is'). Based on this information, the output of the argmax operation is the numerical value 0.78.
Interpreting the Argmax Function in Token Selection
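The key distinction the statement above probes is between argmax and max: argmax returns the token that attains the maximum, while max returns the probability value itself. A minimal sketch (the probabilities for tokens other than 'sunny' are assumed for illustration):

```python
# Distinguish argmax (which token) from max (how probable that token is).
probs = {"sunny": 0.78, "rainy": 0.12, "cold": 0.10}  # non-'sunny' values assumed
predicted_token = max(probs, key=probs.get)  # argmax -> a token
peak_probability = max(probs.values())       # max    -> a number
print(predicted_token, peak_probability)  # sunny 0.78
```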
Left-to-Right Token Generation Process
Learn After
An autoregressive language model generates text one token at a time. At each step, it chooses the single token with the highest conditional probability based on the entire sequence generated so far. The model starts with the context 'The dog' and must choose the next two tokens.
Given the following table of conditional probabilities, which sequence of two tokens will the model generate?
| Current Context | Next Token | Probability |
| --- | --- | --- |
| 'The dog' | 'barked' | 0.7 |
| 'The dog' | 'ran' | 0.2 |
| 'The dog' | 'ate' | 0.1 |
| 'The dog barked' | 'loudly' | 0.9 |
| 'The dog barked' | 'at' | 0.1 |
| 'The dog ran' | 'away' | 0.6 |
| 'The dog ran' | 'to' | 0.4 |

An autoregressive model generates a sequence of three tokens after an initial start token, <s>. It does this by selecting the single most probable token at each step based on the sequence generated so far. Arrange the following actions into the correct chronological order that the model follows.
Analyzing Suboptimal Text Generation
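The two-step greedy generation from the conditional-probability table above can be simulated directly. The nested dictionary is a transcription of that table, not new data:

```python
# Simulate two steps of greedy decoding over the table of conditional
# probabilities for the context 'The dog'.
table = {
    "The dog": {"barked": 0.7, "ran": 0.2, "ate": 0.1},
    "The dog barked": {"loudly": 0.9, "at": 0.1},
    "The dog ran": {"away": 0.6, "to": 0.4},
}

context = "The dog"
generated = []
for _ in range(2):
    probs = table[context]
    token = max(probs, key=probs.get)  # greedy argmax at each step
    generated.append(token)
    context = context + " " + token

print(generated)  # ['barked', 'loudly']
```

Because 'barked' wins the first step, the rows conditioned on 'The dog ran' are never reached; greedy decoding commits to one branch at each step.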