Learn Before
Comparing Sequence Probabilities in Log Space
A language model assigns the following conditional probabilities to two short sentences:
- Sentence A: P("I" | ) = 0.1, P("am" | "I") = 0.5, P("happy" | "am") = 0.2
- Sentence B: P("You" | ) = 0.2, P("are" | "You") = 0.3, P("sad" | "are") = 0.3
Using the natural logarithm (ln), calculate the total log probability for each sentence. Based on your calculations, which sentence is more probable according to the model? Show your work.
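One way to check the arithmetic is a minimal Python sketch that sums the natural logs of the conditional probabilities listed above. The helper name `sequence_log_prob` and the variable names are illustrative assumptions, not part of the exercise.

```python
import math

def sequence_log_prob(cond_probs):
    """Sum the natural logs of a sequence's conditional probabilities.

    Hypothetical helper for illustration: by the chain rule, the log of a
    product of probabilities equals the sum of their logs.
    """
    return sum(math.log(p) for p in cond_probs)

# Conditional probabilities copied from the problem statement.
sentence_a = [0.1, 0.5, 0.2]  # "I am happy"
sentence_b = [0.2, 0.3, 0.3]  # "You are sad"

log_p_a = sequence_log_prob(sentence_a)  # equals ln(0.1 * 0.5 * 0.2) = ln(0.010)
log_p_b = sequence_log_prob(sentence_b)  # equals ln(0.2 * 0.3 * 0.3) = ln(0.018)

# The log is monotonic, so the larger (less negative) total marks the
# more probable sentence.
more_probable = "B" if log_p_b > log_p_a else "A"

print(f"Sentence A: {log_p_a:.4f}")
print(f"Sentence B: {log_p_b:.4f}")
print(f"More probable: Sentence {more_probable}")
```

By hand, Sentence A gives ln(0.1) + ln(0.5) + ln(0.2) = ln(0.010) ≈ -4.605 and Sentence B gives ln(0.2) + ln(0.3) + ln(0.3) = ln(0.018) ≈ -4.017. Because -4.017 is greater than -4.605, Sentence B is the more probable sentence under the model.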
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is designed to calculate the probability of a long sentence by sequentially multiplying the conditional probabilities of each word. Each individual word probability is a small floating-point number (e.g., 0.05, 0.1, 0.02). During testing on sentences with over 100 words, the model consistently outputs a final probability of 0.0, even though no single word has a probability of zero. What is the most likely technical reason for this incorrect result?
Comparing Sequence Probabilities in Log Space
Evaluating Sequence Likelihood with Log Probabilities
Logarithmic Form of the Chain Rule for Sequence Probability
Derivation of Sequence Log-Probability via Chain Rule
Sequence Evaluation using Log-Probability
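The first related question above, about a long product of small word probabilities evaluating to 0.0, is commonly explained by floating-point underflow. The sketch below illustrates the effect under stated assumptions: the sequence length of 300 and the uniform per-word probability of 0.05 are invented for the demonstration, chosen so that Python's double-precision floats underflow. In single precision the same effect would appear with far fewer words.

```python
import math

# Assumed values for illustration only: 300 words, each with probability 0.05.
word_probs = [0.05] * 300

# Naive approach: multiply the probabilities one after another.
product = 1.0
for p in word_probs:
    product *= p  # eventually falls below the smallest representable double

# Log-space approach: add the log probabilities instead.
log_sum = sum(math.log(p) for p in word_probs)

print(product)  # the true value is about 1e-390, too small for a double
print(log_sum)  # stays a finite, ordinary negative number
```

The product collapses to zero even though no single factor is zero, while the sum of logs remains easy to represent and compare.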