A language model is designed to calculate the likelihood of a text sequence by predicting each token based only on the tokens that have come before it. Given the three-token sequence 'The quick brown', which of the following expressions correctly represents how this model would calculate the total probability of the entire sequence?
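The calculation the question is probing is the chain rule of probability: an auto-regressive model scores P(The, quick, brown) = P(The) × P(quick | The) × P(brown | The, quick). A minimal Python sketch of that factorization follows; the conditional probability values are made up for illustration, not the output of any real model.

```python
# Chain-rule factorization used by auto-regressive language models:
#   P(w1, w2, w3) = P(w1) * P(w2 | w1) * P(w3 | w1, w2)

def sequence_probability(tokens, cond_prob):
    """Multiply each token's conditional probability given its prefix."""
    prob = 1.0
    for i, token in enumerate(tokens):
        prefix = tuple(tokens[:i])       # all tokens that came before
        prob *= cond_prob[(prefix, token)]
    return prob

# Illustrative (invented) conditional probabilities:
cond_prob = {
    ((), "The"): 0.2,                    # P("The")
    (("The",), "quick"): 0.05,           # P("quick" | "The")
    (("The", "quick"), "brown"): 0.4,    # P("brown" | "The", "quick")
}

p = sequence_probability(["The", "quick", "brown"], cond_prob)
print(p)  # 0.2 * 0.05 * 0.4 = 0.004
```

Each factor conditions only on the preceding tokens, which is exactly the left-to-right prediction order the question describes.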
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Probability Factorization for Arbitrary Order Token Prediction
Step-by-Step Example of Auto-Regressive Sequence Generation
Standard Auto-Regressive Probability Factorization using Embeddings
Example of Auto-Regressive Probability Calculation
Calculating Sequence Probability in an Auto-regressive Model
Debugging a Sequence Probability Calculation