Formula

Formula for Token Sampling in Autoregressive Models

In autoregressive models, the selection of the next token, $\bar{y}_i$, is formally represented as drawing a sample from the model's conditional probability distribution:

$$\bar{y}_i \sim \overline{\text{Pr}}(y_i \mid \mathbf{x}, \mathbf{y}_{<i})$$

This notation signifies that the token $\bar{y}_i$ is sampled from the probability distribution over all possible tokens $y_i$, conditioned on the input context $\mathbf{x}$ and the sequence of previously generated tokens $\mathbf{y}_{<i}$. The context of preceding tokens, $\mathbf{y}_{<i}$, is sometimes written more compactly as $\mathbf{y}_i$.
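As a rough illustration of the sampling step, the sketch below draws each next token from a softmax distribution and feeds the growing prefix back in as conditioning. Everything named here is an assumption made for the example: `VOCAB`, `toy_logits`, `sample_next_token`, and `generate` are hypothetical, and `toy_logits` is a hand-written stand-in for a trained model, not a real one.

```python
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "<eos>"]


def softmax(logits):
    """Convert raw scores into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(x, y_prev):
    """Hypothetical stand-in for a model's scores given x and the prefix y_<i.

    The input x is accepted but ignored by this toy; the score of <eos>
    simply grows with the prefix length so that generation tends to stop.
    """
    return [2.0, 1.0, 0.5, -1.0 + len(y_prev)]


def sample_next_token(probs, rng):
    """Draw one token index i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(x, max_len, rng):
    """Sample tokens one at a time, each conditioned on x and the prefix so far."""
    y = []
    for _ in range(max_len):
        probs = softmax(toy_logits(x, y))        # distribution over y_i
        token_id = sample_next_token(probs, rng)  # the sampled token
        y.append(token_id)
        if VOCAB[token_id] == "<eos>":
            break
    return y


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make runs repeatable
    ids = generate("some input", max_len=10, rng=rng)
    print([VOCAB[i] for i in ids])
```

Because each token is drawn at random rather than chosen as the most probable one, different seeds generally produce different sequences from the same input.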

Updated 2025-10-08

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences