Scaling Issues in Statistical Language Models
Imagine you have built a language model that predicts the next word based on the two preceding words. It works reasonably well with a small vocabulary of 1,000 words. Explain why simply increasing the vocabulary to 100,000 words and extending the context to the four preceding words would likely cause a drastic decrease in the model's ability to handle new, unseen sentences, even with a much larger training text. In your explanation, describe the nature of the data representation problem that arises.
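To make the scale of the problem concrete, here is a minimal sketch (an illustration, not part of the original card) that counts how many distinct contexts a purely count-based n-gram model would have to observe in each configuration:

```python
# Number of distinct contexts a count-based n-gram model must cover.
# With a 1,000-word vocabulary and a 2-word context there are
# 1,000^2 = 10^6 possible contexts; with 100,000 words and a
# 4-word context there are 100,000^4 = 10^20 -- far more than any
# training corpus can populate, so most counts remain zero
# (the data-sparsity problem the question points at).

def num_contexts(vocab_size: int, context_len: int) -> int:
    """Possible distinct contexts of length context_len."""
    return vocab_size ** context_len

small = num_contexts(1_000, 2)     # 10**6
large = num_contexts(100_000, 4)   # 10**20

print(small)            # 1000000
print(large)            # 100000000000000000000
print(large // small)   # blow-up factor: 10**14
```

Even a trillion-token corpus covers only a vanishing fraction of the 10^20 possible four-word contexts, which is why the model assigns zero probability to most unseen but plausible sequences.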
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Neural Language Models (NLMs)
A data scientist is building a language model to predict the next word in a sequence. The model estimates the probability of a word based on the four words that precede it, using counts from a massive text corpus. Despite the large training dataset, the model performs poorly on new sentences, frequently assigning a probability of zero to perfectly plausible word sequences. Which of the following statements best analyzes the fundamental reason for this failure?
Scaling Issues in Statistical Language Models
Diagnosing a Failing Autocomplete System