Learn Before
Probability Renormalization Formula for Restricted Vocabulary Sampling
In sampling-based decoding methods like top-k or top-p, after a restricted vocabulary $V'$ is selected, the probabilities of the tokens within this set are rescaled to form a new, valid probability distribution. The renormalized probability $P'(x)$ of a token $x \in V'$ is calculated by dividing its original conditional probability by the sum of the original probabilities of all tokens in the restricted set $V'$. This is expressed as:

$$P'(x \mid x_{<t}) = \frac{P(x \mid x_{<t})}{\sum_{x' \in V'} P(x' \mid x_{<t})}$$
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Top-k Sampling with k=3
Top-k Selection Pool
Probability Renormalization Formula for Top-k Sampling
A language model is generating the next word in a sequence and has calculated the initial probabilities for the five most likely candidates: the (0.4), a (0.2), one (0.1), his (0.05), and her (0.05). If the model uses a sampling strategy where it only considers the top 3 most likely candidates (k = 3), what will be the new, rescaled probability distribution for this reduced set of candidates from which the final word will be sampled?
Arrange the following actions into the correct sequence that describes the process of selecting the next token in a text generation model using the top-k sampling method.
Analyzing Text Generation Outputs
Learn After
A language model predicts the probabilities for the next word in a sequence. The top four candidates are: 'happy' (0.4), 'sad' (0.2), 'angry' (0.1), and 'joyful' (0.05). A decoding method is applied that restricts the possible choices to only the top three candidates ('happy', 'sad', 'angry'). After the probabilities for this smaller set are rescaled to form a new, valid probability distribution, what is the new probability for the word 'sad'?
Debugging a Sampling Algorithm
Impact of Vocabulary Set Size on Renormalized Probabilities