Formula

Probability Renormalization Formula for Top-k Sampling

In top-k sampling, after identifying the pool $\overline{V}_i$ of the k most probable tokens, their probabilities are renormalized to form a new distribution that sums to 1. The renormalized probability of a token $y_i$ from this pool is its original probability divided by the sum of the original probabilities of all tokens in the pool:

$$\overline{\text{Pr}}(y_i|\mathbf{x}, \mathbf{y}_{<i}) = \frac{\text{Pr}(y_i|\mathbf{x}, \mathbf{y}_{<i})}{\sum_{y_j \in \overline{V}_i} \text{Pr}(y_j|\mathbf{x}, \mathbf{y}_{<i})}$$

Tokens outside $\overline{V}_i$ receive probability 0, so the renormalized probabilities over $\overline{V}_i$ sum to 1.
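The renormalization can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `top_k_renormalize` and the toy probability list are assumptions made for the example, and the input is assumed to already be a valid next-token distribution over a small vocabulary indexed by position.

```python
import heapq


def top_k_renormalize(probs, k):
    """Map each of the k most probable token indices to its renormalized probability.

    `probs` is assumed to hold Pr(y_j | x, y_<i) for every token j in the vocabulary.
    """
    if not 1 <= k <= len(probs):
        raise ValueError("k must be between 1 and len(probs)")
    # The pool: indices of the k largest original probabilities.
    pool = heapq.nlargest(k, range(len(probs)), key=probs.__getitem__)
    # Denominator of the formula: original probability mass inside the pool.
    mass = sum(probs[j] for j in pool)
    # Divide each pooled probability by that mass; tokens outside the pool are dropped.
    return {j: probs[j] / mass for j in pool}


# Toy distribution over 7 tokens (hypothetical values chosen for easy arithmetic).
probs = [0.1, 0.375, 0.1, 0.125, 0.1, 0.1, 0.1]
renorm = top_k_renormalize(probs, k=2)
print(renorm)  # {1: 0.75, 3: 0.25}
```

Here the pool holds tokens 1 and 3 with combined mass 0.5, so their probabilities become 0.375 / 0.5 = 0.75 and 0.125 / 0.5 = 0.25. A next token could then be drawn from this dictionary with, for example, `random.choices(list(renorm), weights=list(renorm.values()))`.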


Updated 2026-05-05


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences