Learn Before
Log probabilities
By using log probabilities instead of raw probabilities, we work with numbers that are not vanishingly small (for example, log(0.0001) ≈ -9.21), so a long product of probabilities becomes a sum of logs that floating-point arithmetic can represent. If all computation and storage are done in log space, we only need to convert back into probabilities when reporting them at the end, by taking the exp of the logprob:
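A minimal Python sketch of this pattern (the per-token probabilities are illustrative):

```python
import math

# Illustrative per-token probabilities
probs = [0.05, 0.1, 0.02]

# Work in log space: a product of probabilities becomes a sum of logs
logprob = sum(math.log(p) for p in probs)

# Convert back to a probability only when reporting the result
prob = math.exp(logprob)
print(logprob)  # ≈ -9.2103
print(prob)     # ≈ 0.0001
```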
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Learn After
A language model is designed to calculate the probability of a long sentence by sequentially multiplying the conditional probabilities of each word. Each individual word probability is a small floating-point number (e.g., 0.05, 0.1, 0.02). During testing on sentences with over 100 words, the model consistently outputs a final probability of 0.0, even though no single word has a probability of zero. What is the most likely technical reason for this incorrect result?
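A minimal sketch of the failure mode the question describes, assuming an illustrative per-token probability; note that Python floats are 64-bit doubles, so more tokens are needed to trigger underflow here than with 32-bit floats:

```python
import math

# 500 tokens, each with an illustrative probability of 0.05
p, n = 0.05, 500

# Multiplying raw probabilities underflows to exactly 0.0
raw = 1.0
for _ in range(n):
    raw *= p
print(raw)  # 0.0 -- smaller than the smallest positive double

# The same quantity is easily represented in log space
logprob = n * math.log(p)
print(logprob)  # ≈ -1497.87
# math.exp(logprob) would underflow too, so comparisons between
# sequences should stay in log space rather than converting back
```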
Comparing Sequence Probabilities in Log Space
Evaluating Sequence Likelihood with Log Probabilities