1Cademy - Comparing Sequence Probabilities in Log Space

Learn Before

Log Probabilities

Short Answer

Comparing Sequence Probabilities in Log Space

A language model assigns the following conditional probabilities to two short sentences:

Sentence A: P("I" | ) = 0.1, P("am" | "I") = 0.5, P("happy" | "am") = 0.2
Sentence B: P("You" | ) = 0.2, P("are" | "You") = 0.3, P("sad" | "are") = 0.3

Using the natural logarithm (ln), calculate the total log probability for each sentence. Based on your calculations, which sentence is more probable according to the model? Show your work.
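
One way to check the arithmetic is to sum the natural logs of the conditional probabilities for each sentence. The following is a minimal sketch, not an official solution; the helper name sequence_log_prob and the variable names are illustrative assumptions, and the probabilities are copied from the question above.

```python
import math

def sequence_log_prob(probs):
    """Sum of natural logs of the per-word conditional probabilities."""
    return sum(math.log(p) for p in probs)

# Conditional probabilities taken from the question statement.
sentence_a = [0.1, 0.5, 0.2]   # "I", "am", "happy"
sentence_b = [0.2, 0.3, 0.3]   # "You", "are", "sad"

log_p_a = sequence_log_prob(sentence_a)   # ln(0.1) + ln(0.5) + ln(0.2) = ln(0.010)
log_p_b = sequence_log_prob(sentence_b)   # ln(0.2) + ln(0.3) + ln(0.3) = ln(0.018)

# The log is monotonic, so the larger (less negative) log probability
# belongs to the more probable sentence.
more_probable = "A" if log_p_a > log_p_b else "B"

print(f"Sentence A: {log_p_a:.4f}")
print(f"Sentence B: {log_p_b:.4f}")
print(f"More probable: Sentence {more_probable}")
```

The totals come to roughly -4.605 for Sentence A and -4.017 for Sentence B, so Sentence B is the more probable one under the model. Exponentiating recovers the ordinary products, 0.010 and 0.018.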

Updated 2025-10-02

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

A language model is designed to calculate the probability of a long sentence by sequentially multiplying the conditional probabilities of each word. Each individual word probability is a small floating-point number (e.g., 0.05, 0.1, 0.02). During testing on sentences with over 100 words, the model consistently outputs a final probability of 0.0, even though no single word has a probability of zero. What is the most likely technical reason for this incorrect result?
Evaluating Sequence Likelihood with Log Probabilities
Logarithmic Form of the Chain Rule for Sequence Probability
Derivation of Sequence Log-Probability via Chain Rule
Sequence Evaluation using Log-Probability
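
The first related question above, about a long sentence whose probability comes out as 0.0, describes floating-point underflow. The sketch below illustrates the effect under stated assumptions: the word probabilities are hypothetical (every word is given 0.05), and because Python floats are 64-bit, 400 words are used to push the product below the smallest representable value. In 32-bit precision the same thing would happen with far fewer words.

```python
import math

# Hypothetical per-word probabilities: 400 words, each with probability 0.05.
word_probs = [0.05] * 400

# Direct multiplication: the running product shrinks until it underflows to 0.0.
product = 1.0
for p in word_probs:
    product *= p

# Log space: summing logs keeps the result as an ordinary finite negative number.
log_prob = sum(math.log(p) for p in word_probs)

print("Product of probabilities:", product)
print("Sum of log probabilities:", round(log_prob, 2))
```

The product prints as 0.0 even though no single factor is zero, while the log-space sum stays finite (about -1198.29 here) and can still be compared against other sequences.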
