Case Study

Effect of Temperature on Token Generation

A language model is trying to complete the sentence 'The cat sat on the ___.' It has calculated the following output scores for potential next words: {'mat': 4.0, 'rug': 3.5, 'throne': 1.0, 'car': -2.0}. The model's output probabilities are determined by the formula: Pr(word) = exp(score / β) / Σ exp(scores / β), where β is a temperature parameter. Consider two scenarios for generating the next word: Scenario A with β = 0.5 and Scenario B with β = 1.5. In which scenario is the model more likely to generate the word 'throne'? Justify your answer by explaining the role of the temperature parameter β in shaping the final probability distribution.
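The formula above is a temperature-scaled softmax. A minimal sketch in Python (the function name and dictionary layout are illustrative, not part of the original problem) makes the two scenarios concrete:

```python
import math

def softmax_with_temperature(scores, beta):
    """Pr(w) = exp(score_w / beta) / sum over all words of exp(score / beta)."""
    exps = {word: math.exp(s / beta) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

scores = {"mat": 4.0, "rug": 3.5, "throne": 1.0, "car": -2.0}

p_a = softmax_with_temperature(scores, beta=0.5)  # Scenario A: low temperature
p_b = softmax_with_temperature(scores, beta=1.5)  # Scenario B: high temperature
```

Comparing `p_a["throne"]` with `p_b["throne"]` lets you check your reasoning numerically: dividing by a small β stretches the score gaps apart before exponentiation, while a large β compresses them, which directly shapes how much probability mass low-scoring words receive.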


Updated 2025-10-07


Tags

Ch.5 Inference - Foundations of Large Language Models
