Case Study

Effect of Temperature on Token Generation

A language model is trying to complete the sentence 'The cat sat on the ___.' It has calculated the following output scores for potential next words: {'mat': 4.0, 'rug': 3.5, 'throne': 1.0, 'car': -2.0}. The model's output probabilities are determined by the formula: Pr(word) = exp(score / β) / Σ exp(scores / β), where β is a temperature parameter. Consider two scenarios for generating the next word: Scenario A with β = 0.5 and Scenario B with β = 1.5. In which scenario is the model more likely to generate the word 'throne'? Justify your answer by explaining the role of the temperature parameter β in shaping the final probability distribution.
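The formula above is a temperature-scaled softmax. A minimal sketch in Python (the function name and dictionary layout are illustrative, not part of the original problem) makes the two scenarios concrete:

```python
import math

def softmax_with_temperature(scores, beta):
    """Pr(w) = exp(score_w / beta) / sum over all words of exp(score / beta)."""
    exps = {word: math.exp(s / beta) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

scores = {"mat": 4.0, "rug": 3.5, "throne": 1.0, "car": -2.0}

p_a = softmax_with_temperature(scores, beta=0.5)  # Scenario A: low temperature
p_b = softmax_with_temperature(scores, beta=1.5)  # Scenario B: high temperature
```

Comparing `p_a["throne"]` with `p_b["throne"]` lets you check your reasoning numerically: dividing by a small β stretches the score gaps apart before exponentiation, while a large β compresses them, which directly shapes how much probability mass low-scoring words receive.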


Updated 2025-10-07


Tags

Ch.5 Inference - Foundations of Large Language Models
