Argmax Formula for Next Token Prediction

In the task of next token prediction, a language model determines the most likely subsequent token, $\hat{x}_i$, given a preceding context $x_0, \ldots, x_{i-1}$. This is achieved by selecting the token from the entire vocabulary $\mathcal{V}$ that maximizes the conditional probability output by the model. This selection process is formally expressed as:

$$\hat{x}_i = \operatorname*{arg\,max}_{x_i \in \mathcal{V}} \Pr(x_i \mid x_0, \ldots, x_{i-1})$$
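
As an illustration of the argmax selection, here is a minimal Python sketch. It assumes the model exposes one raw score (logit) per vocabulary token and that a softmax turns those scores into the conditional probabilities; the vocabulary, the scores, and the helper names `softmax` and `greedy_next_token` are hypothetical and not taken from the source.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_next_token(vocab, logits):
    """Return the token in `vocab` with the highest conditional probability."""
    probs = softmax(logits)
    # argmax over the vocabulary indices
    best = max(range(len(vocab)), key=lambda k: probs[k])
    return vocab[best], probs[best]

# Hypothetical vocabulary and model scores for some preceding context
vocab = ["mat", "dog", "moon", "sofa"]
logits = [2.0, 0.5, -1.0, 1.2]

token, prob = greedy_next_token(vocab, logits)
print(token, round(prob, 3))
```

Because softmax preserves the ordering of the scores, taking the argmax over the logits directly would select the same token; the probabilities are computed here only to mirror the formula.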

Updated 2026-04-18

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences