Evaluating the 'Arg Max' Prediction Strategy
A common strategy for generating text with a probabilistic model is to always choose the single most likely output, a process formally described as $\hat{y} = \arg\max_{y} P(y \mid x)$, where $x$ is the input and $y$ ranges over the candidate outputs. Evaluate this strategy. Discuss one significant advantage and one significant disadvantage of strictly adhering to this rule for tasks like creative writing or chatbot conversations.
Related
Next-Word Prediction Model
A language model's prediction rule is to select the output with the highest conditional probability. Given the input text 'The ocean is deep and...', the model computes the following probabilities for the next word:
- P('mysterious' | 'The ocean is deep and...') = 0.55
- P('blue' | 'The ocean is deep and...') = 0.30
- P('empty' | 'The ocean is deep and...') = 0.10
- P('loud' | 'The ocean is deep and...') = 0.05
Based on its prediction rule, which word will the model choose?
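The prediction rule in this example can be sketched in a few lines of Python. This is only an illustrative sketch: the dictionary `next_word_probs` and the helper `argmax_choice` are hypothetical names introduced here, and the probabilities are copied from the list above rather than produced by a real model.

```python
# Next-word probabilities for the prompt 'The ocean is deep and...',
# copied from the example above (not computed by an actual model).
next_word_probs = {
    "mysterious": 0.55,
    "blue": 0.30,
    "empty": 0.10,
    "loud": 0.05,
}


def argmax_choice(probs):
    """Return the candidate with the highest probability (the arg max rule)."""
    # max() iterates over the dict keys and ranks them by their probability.
    return max(probs, key=probs.get)


chosen = argmax_choice(next_word_probs)
print(chosen)
```

Because the rule involves no randomness, calling `argmax_choice` on the same probabilities always returns the same word, which is the property the evaluation question asks you to weigh.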
Temperature-Scaled Softmax for Token Probability