Formula

Temperature-Scaled Softmax for Renormalized Probability

To control the randomness in token selection, the probability distribution can be reshaped using a temperature parameter $\beta$. The renormalized conditional probability of a token $y_i$, given the context $(\mathbf{x}, \mathbf{y}_{<i})$, is calculated by applying a temperature-scaled Softmax function to its logit $u_{y_i}$ and normalizing over a restricted set of candidate tokens $\overline{V}_i$:

$$\overline{\text{Pr}}(y_i|\mathbf{x}, \mathbf{y}_{<i}) = \frac{\exp(u_{y_i}/\beta)}{\sum_{y_j \in \overline{V}_i} \exp(u_{y_j}/\beta)}$$

A smaller $\beta$ sharpens the distribution toward the highest-logit tokens, while a larger $\beta$ flattens it toward uniform over $\overline{V}_i$.
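The formula can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `renormalized_probs`, the toy logits, and the choice of a three-token candidate set (as a top-k style filter might produce) are all assumptions made for the example. Subtracting the maximum scaled logit before exponentiating is a standard numerical-stability step that leaves the result unchanged.

```python
import math

def renormalized_probs(logits, candidates, beta=1.0):
    """Temperature-scaled softmax restricted to a candidate set.

    logits:     dict mapping token -> logit u (hypothetical toy input)
    candidates: tokens in the restricted set (V-bar_i in the formula)
    beta:       temperature, must be positive
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Scale each candidate's logit by 1/beta.
    scaled = {tok: logits[tok] / beta for tok in candidates}
    # Subtract the max for numerical stability; the ratio is unaffected.
    m = max(scaled.values())
    exps = {tok: math.exp(s - m) for tok, s in scaled.items()}
    total = sum(exps.values())
    # Normalize over the candidate set only.
    return {tok: e / total for tok, e in exps.items()}

# Toy logits over a four-token vocabulary (illustrative values only).
logits = {"cat": 2.0, "dog": 1.0, "car": 0.0, "the": -3.0}
# Assumed candidate set: the three highest-logit tokens.
candidates = ["cat", "dog", "car"]

sharp = renormalized_probs(logits, candidates, beta=0.5)
plain = renormalized_probs(logits, candidates, beta=1.0)
flat = renormalized_probs(logits, candidates, beta=2.0)
```

In this sketch, the probability of `"cat"` is highest under `beta=0.5` and lowest under `beta=2.0`, and `"the"` receives no probability at all because it lies outside the candidate set.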

Updated 2026-05-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences