Learn Before
When a language model evaluates different possible output sequences, why is it standard practice to sum their log-probabilities instead of multiplying their raw probabilities?
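In short: multiplying many probabilities, each below 1, quickly underflows floating-point arithmetic to exactly 0.0, while summing their logarithms stays in a numerically safe range and preserves the same ranking, since log is monotonic. A minimal Python sketch, using a hypothetical per-token probability of 0.01 over 1,000 tokens:

```python
import math

# Hypothetical values for illustration: 1,000 tokens, each with
# conditional probability 0.01.
p = 0.01
n = 1000

# Multiplying raw probabilities underflows 64-bit floats to exactly
# 0.0, destroying any ability to compare candidate sequences.
prob_product = 1.0
for _ in range(n):
    prob_product *= p
print(prob_product)  # 0.0 (underflow)

# Summing log-probabilities carries the same information safely.
log_prob_sum = sum(math.log(p) for _ in range(n))
print(log_prob_sum)  # approximately -4605.17
```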
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Incremental Calculation of Sequence Log-Probability
Example of Autoregressive Generation and Log-Probability Calculation
A language model is generating a continuation for the input 'The best way to learn a new skill is'. It has produced two candidate sequences and calculated their total log-probabilities as follows:
- Sequence A: '...by practicing consistently.' (Total log-probability = -1.15)
- Sequence B: '...through osmotic absorption.' (Total log-probability = -7.82)
Based on these values, which sequence is considered more plausible by the model, and why?
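Because exp() is monotonically increasing, the sequence with the higher (less negative) total log-probability is always the more probable one. A quick check of the two totals from this example:

```python
import math

log_prob_a = -1.15  # '...by practicing consistently.'
log_prob_b = -7.82  # '...through osmotic absorption.'

# Converting back to probabilities confirms the ranking.
print(math.exp(log_prob_a))  # ~0.3166
print(math.exp(log_prob_b))  # ~0.0004

# Sequence A is more plausible: -1.15 > -7.82.
```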
A language model has generated the sequence 'The sun is' with a cumulative log-probability of -2.5. The model is now considering the next token. Given the following conditional log-probabilities for the next token, which choice would result in the most probable four-word sequence?
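The candidate log-probabilities are not shown in this excerpt, but the update rule is simple: each candidate's total is the running sum of -2.5 plus its conditional log-probability, and the best continuation is the argmax. A minimal sketch, with hypothetical tokens and values standing in for the missing options:

```python
import math

prefix_log_prob = -2.5  # cumulative log P('The sun is')

# Hypothetical candidates; the actual options are not shown above.
candidates = {
    "bright": -0.5,  # hypothetical log P(token | 'The sun is')
    "heavy": -4.0,   # hypothetical
}

# Total log-probability of each extended sequence is just a sum.
totals = {tok: prefix_log_prob + lp for tok, lp in candidates.items()}
best = max(totals, key=totals.get)
print(totals)  # {'bright': -3.0, 'heavy': -6.5}
print(best)    # 'bright'
```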