A data scientist is building a language model to predict the next word in a sequence. The model estimates the probability of a word based on the four words that precede it, using counts from a massive text corpus. Despite the large training dataset, the model performs poorly on new sentences, frequently assigning a probability of zero to perfectly plausible word sequences. Which of the following statements best analyzes the fundamental reason for this failure?
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Neural Language Models (NLMs)
A data scientist is building a language model to predict the next word in a sequence. The model estimates the probability of a word based on the four words that precede it, using counts from a massive text corpus. Despite the large training dataset, the model performs poorly on new sentences, frequently assigning a probability of zero to perfectly plausible word sequences. Which of the following statements best analyzes the fundamental reason for this failure?
Scaling Issues in Statistical Language Models
Diagnosing a Failing Autocomplete System