Formula for Token Sampling in Autoregressive Models
In autoregressive models, the selection of the next token, $y_{i+1}$, is formally represented as drawing a sample from the model's conditional probability distribution. This is expressed by the formula $y_{i+1} \sim \Pr(y \mid x, y_1, \ldots, y_i)$. This notation signifies that the token is sampled from the probability distribution over all possible tokens $y$, conditioned on the input context $x$ and the sequence of previously generated tokens $y_1, \ldots, y_i$. The context of preceding tokens, $y_1, \ldots, y_i$, is sometimes written more compactly as $y_{<i+1}$.
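As a concrete illustration, sampling a next token from such a conditional distribution can be sketched in a few lines of Python. The vocabulary and probabilities below are invented for illustration; in a real model they would be computed from the input context and the previously generated tokens.

```python
import random

# Hypothetical next-token distribution produced by a model for some context.
# These tokens and probabilities are made up for illustration only.
distribution = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}

tokens = list(distribution)
weights = list(distribution.values())

# Sampling the next token: each token is drawn in proportion to its
# conditional probability, so repeated runs can yield different tokens.
next_token = random.choices(tokens, weights=weights, k=1)[0]
print(next_token)
```

Because the draw is random rather than a fixed argmax, running this repeatedly produces "the" most often but occasionally yields the lower-probability tokens as well.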

Tags
Ch.5 Inference - Foundations of Large Language Models
Computing Sciences
Related
Applying Token Sampling in Text Generation
An autoregressive language model has processed the input sequence 'The cat sat on the' and has calculated the following conditional probability distribution for the next token: P('mat'|context) = 0.6, P('rug'|context) = 0.3, P('floor'|context) = 0.08, P('sky'|context) = 0.02. If the model then samples a token from this distribution, which of the following statements is most accurate?
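A minimal sketch of the scenario in this question, using Python's standard library. The token strings and probabilities are taken directly from the question; drawing many samples shows that 'mat' is the most frequent outcome but not the guaranteed one.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so this sketch is reproducible

# Conditional distribution from the question: P(token | 'The cat sat on the').
probs = {"mat": 0.6, "rug": 0.3, "floor": 0.08, "sky": 0.02}
tokens, weights = zip(*probs.items())

# Draw many samples: 'mat' comes up most often (about 60% of draws),
# but any token with nonzero probability can be selected on a given draw.
counts = Counter(random.choices(tokens, weights=weights, k=10_000))
print(counts.most_common())
```

The empirical frequencies track the stated probabilities, which is the key point: sampling favors high-probability tokens without excluding the others.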
In autoregressive text generation, after the model computes the conditional probability distribution for the next token, the sampling process does not always select the token with the highest probability score; it draws a token at random according to the distribution, so lower-probability tokens can also be chosen.
Learn After
Sequence Extension with a Sampled Token
An autoregressive language model has generated the sequence of tokens: 'The quick brown fox'. It is now about to generate the next token. Which expression accurately describes how the model will select this next token?
An autoregressive model selects the next token by sampling from a conditional probability distribution, represented by the formula $y_{i+1} \sim \Pr(y \mid x, y_1, \ldots, y_i)$. Match each component of this formula to its correct description.
Explaining Model Output Variability