Learn Before
When a language model evaluates different possible output sequences, why is it standard practice to sum their log-probabilities instead of multiplying their raw probabilities?
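In short: multiplying many probabilities, each below 1, quickly underflows floating-point arithmetic to exactly 0.0, while summing their logarithms stays in a numerically safe range and preserves the same ranking, since log is monotonic. A minimal Python sketch, using a hypothetical per-token probability of 0.01 over 1,000 tokens:

```python
import math

# Hypothetical values for illustration: 1,000 tokens, each with
# conditional probability 0.01.
p = 0.01
n = 1000

# Multiplying raw probabilities underflows 64-bit floats to exactly
# 0.0, destroying any ability to compare candidate sequences.
prob_product = 1.0
for _ in range(n):
    prob_product *= p
print(prob_product)  # 0.0 (underflow)

# Summing log-probabilities carries the same information safely.
log_prob_sum = sum(math.log(p) for _ in range(n))
print(log_prob_sum)  # approximately -4605.17
```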
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Incremental Calculation of Sequence Log-Probability
Example of Autoregressive Generation and Log-Probability Calculation
A language model is generating a continuation for the input 'The best way to learn a new skill is'. It has produced two candidate sequences and calculated their total log-probabilities as follows:
- Sequence A: '...by practicing consistently.' (Total log-probability = -1.15)
- Sequence B: '...through osmotic absorption.' (Total log-probability = -7.82)
Based on these values, which sequence is considered more plausible by the model, and why?
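Because exp() is monotonically increasing, the sequence with the higher (less negative) total log-probability is always the more probable one. A quick check of the two totals from this example:

```python
import math

log_prob_a = -1.15  # '...by practicing consistently.'
log_prob_b = -7.82  # '...through osmotic absorption.'

# Converting back to probabilities confirms the ranking.
print(math.exp(log_prob_a))  # ~0.3166
print(math.exp(log_prob_b))  # ~0.0004

# Sequence A is more plausible: -1.15 > -7.82.
```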
A language model has generated the sequence 'The sun is' with a cumulative log-probability of -2.5. The model is now considering the next token. Given the following conditional log-probabilities for the next token, which choice would result in the most probable four-word sequence?
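The candidate log-probabilities are not shown in this excerpt, but the update rule is simple: each candidate's total is the running sum of -2.5 plus its conditional log-probability, and the best continuation is the argmax. A minimal sketch, with hypothetical tokens and values standing in for the missing options:

```python
import math

prefix_log_prob = -2.5  # cumulative log P('The sun is')

# Hypothetical candidates; the actual options are not shown above.
candidates = {
    "bright": -0.5,  # hypothetical log P(token | 'The sun is')
    "heavy": -4.0,   # hypothetical
}

# Total log-probability of each extended sequence is just a sum.
totals = {tok: prefix_log_prob + lp for tok, lp in candidates.items()}
best = max(totals, key=totals.get)
print(totals)  # {'bright': -3.0, 'heavy': -6.5}
print(best)    # 'bright'
```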