Learn Before
A language model calculates the probability of a sequence of three tokens, {x₀, x₁, x₂}, using the formula: Pr(x₀, x₁, x₂) = Pr(x₀) * Pr(x₁|x₀) * Pr(x₂|x₀, x₁). If the model determines that the initial token, x₀, is an impossible event, what is the joint probability of the entire sequence?
0
1
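The chain-rule factorization in the question can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's implementation: the function name `joint_probability` and all probability values are hypothetical, chosen only to show how a single zero factor affects the product.

```python
from math import prod


def joint_probability(factors):
    """Multiply chain-rule factors: Pr(x0) * Pr(x1|x0) * Pr(x2|x0, x1)."""
    for p in factors:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(factors)


# Hypothetical factors for an ordinary three-token sequence.
ordinary = joint_probability([0.2, 0.5, 0.9])

# Same conditional factors, but the initial token is impossible: Pr(x0) = 0.
impossible_start = joint_probability([0.0, 0.5, 0.9])

print(ordinary)
print(impossible_start)  # 0.0
```

Because the joint probability is a product, no value of the later conditional factors can offset a zero in the first position.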
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Related
Implication of an Impossible Initial Event
Consequence of an Impossible Starting Token
True or false: A language model is calculating the probability of the sequence 'Zxq#w the cat sat'. The model's vocabulary does not contain the token 'Zxq#w', so the probability of that initial token is zero. The model can still assign a non-zero probability to the entire sequence because the subsequent words 'the', 'cat', and 'sat' have high probabilities.