Learn Before
Language Model Debugging Scenario
Based on the principles of calculating the probability of a sequence, explain why the single error described in the case study prevents the model from generating any output at all.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model calculates the joint probability of a sequence of events (e.g., words) by multiplying the probability of the first event by the conditional probabilities of each subsequent event. Given the following probabilities for a three-event sequence (x₀, x₁, x₂), what is the joint probability of the entire sequence?
- Probability of the first event, Pr(x₀) = 0.0
- Probability of the second event given the first, Pr(x₁|x₀) = 0.4
- Probability of the third event given the first two, Pr(x₂|x₀, x₁) = 0.8
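The chain-rule product described in this question can be checked with a minimal sketch. The function name `joint_probability` and its argument layout are hypothetical, chosen only for illustration; the numbers are the ones listed above.

```python
import math


def joint_probability(first_prob, conditional_probs):
    """Chain rule: Pr(x0) times the product of each later conditional probability.

    `first_prob` is Pr(x0); `conditional_probs` holds Pr(x1|x0), Pr(x2|x0, x1), ...
    in order. Both names are illustrative assumptions, not part of the course material.
    """
    return first_prob * math.prod(conditional_probs)


# Values from the list above: Pr(x0) = 0.0, Pr(x1|x0) = 0.4, Pr(x2|x0, x1) = 0.8
print(joint_probability(0.0, [0.4, 0.8]))  # prints 0.0
```

Because the joint probability is a product, a single zero factor makes the whole result zero regardless of the other factors.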
Language Model Debugging Scenario
A language model is generating a sequence of words. The first word has a probability of 0. However, the conditional probabilities for all subsequent words in the sequence are very high (e.g., 0.99 for each). In this scenario, the high probabilities of the later words can overcome the initial zero probability, resulting in a non-zero joint probability for the entire sequence.
Explaining Zero Probability Sequences