1Cademy - Softmax Renormalization in Top-k Sampling

Learn Before

Top-k Sampling

Formula

Softmax Renormalization in Top-k Sampling

In top-k sampling, once the candidate pool $\overline{V}_i$ (the $k$ highest-scoring tokens at step $i$) has been determined, the probability distribution over this restricted set is obtained by applying the Softmax function to the logits of the tokens in the pool only. If $u_{y_i}$ denotes the logit of a token $y_i \in \overline{V}_i$, the rescaled probability $\overline{\Pr}(y_i|\mathbf{x},\mathbf{y}_{<i})$ is given by:

$$\overline{\Pr}(y_i|\mathbf{x},\mathbf{y}_{<i}) = \frac{\exp(u_{y_i})}{\sum_{y_j \in \overline{V}_i} \exp(u_{y_j})}$$

Tokens outside $\overline{V}_i$ receive probability zero, so the rescaled probabilities sum to one over the pool.
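The renormalization can be illustrated with a minimal Python sketch. The function name `top_k_renormalized_probs`, the list-of-floats representation of the logits, and the example values are assumptions made for illustration and do not come from the source. The sketch selects the $k$ largest logits as the candidate pool and applies Softmax over that pool only.

```python
import math


def top_k_renormalized_probs(logits, k):
    """Return {token_index: probability} over the k highest-logit tokens.

    Hypothetical helper for illustration: `logits` is a list of floats,
    one per vocabulary entry, and `k` is the size of the candidate pool.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")

    # Candidate pool: indices of the k largest logits.
    pool = sorted(range(len(logits)), key=lambda j: logits[j], reverse=True)[:k]

    # Subtract the largest logit before exponentiating for numerical
    # stability; this shift cancels in the ratio and leaves Softmax unchanged.
    u_max = max(logits[j] for j in pool)
    exps = {j: math.exp(logits[j] - u_max) for j in pool}

    # Normalize over the pool only, as in the formula.
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


# Illustrative logits for a four-token vocabulary (made-up values).
logits = [2.0, 1.0, 0.5, -1.0]
probs = top_k_renormalized_probs(logits, k=2)
```

With `k=2`, only the tokens at indices 0 and 1 remain in the pool and their probabilities sum to one. Applying Softmax to the pooled logits gives the same result as computing Softmax over the full vocabulary and then dividing each retained probability by the total probability mass of the pool, since the full-vocabulary normalizer cancels.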

Updated 2026-05-05


References

Reference of Foundations of Large Language Models Course
