Applying the Chain Rule to a Sequence
A language model is generating the sequence of tokens 'the', 'dog', 'barked'. Write the mathematical expression for the joint probability of this entire sequence, Pr(the, dog, barked), by decomposing it into a product of conditional probabilities.
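As a sketch of the decomposition the question asks for, the chain rule writes the joint probability as the probability of the first token, multiplied by the probability of each later token conditioned on all tokens before it. The notation follows the Pr(·) form used in the question:

```latex
\Pr(\text{the}, \text{dog}, \text{barked})
  = \Pr(\text{the}) \cdot \Pr(\text{dog} \mid \text{the}) \cdot \Pr(\text{barked} \mid \text{the}, \text{dog})
```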
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is generating a sequence of words. Given the following conditional probabilities, what is the joint probability of the model generating the exact sequence 'The cat sat'?
- The probability of starting with 'The' is 0.4.
- The probability of 'cat' following 'The' is 0.5.
- The probability of 'sat' following 'The cat' is 0.9.
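The arithmetic for this related question can be checked with a minimal Python sketch. The three probabilities come from the list above; the variable names and the `joint_probability` helper are illustrative assumptions, not part of the course material.

```python
import math

# Conditional probabilities taken from the question statement.
# Variable names are illustrative only.
p_the = 0.4                 # Pr(The)
p_cat_given_the = 0.5       # Pr(cat | The)
p_sat_given_the_cat = 0.9   # Pr(sat | The, cat)


def joint_probability(conditionals):
    """Multiply chain-rule factors to get the joint probability of a sequence."""
    return math.prod(conditionals)


joint = joint_probability([p_the, p_cat_given_the, p_sat_given_the_cat])
print(f"Pr(The, cat, sat) = {joint:.2f}")  # prints Pr(The, cat, sat) = 0.18
```

For longer sequences, summing log-probabilities instead of multiplying raw probabilities is a common way to avoid numerical underflow.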
Comparing Language Model Sequence Probabilities