Learn Before
A team is developing a language model to predict the next word in a sentence. They find that their model assigns a probability of zero to the phrase 'the innovative chef prepares...' because it has never seen the specific two-word sequence 'innovative chef' in its training data, despite having seen 'innovative ideas' and 'master chef' many times. Which characteristic of a neural network-based approach to language modeling is specifically designed to overcome this type of generalization failure?
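The failure described in the question can be reproduced with a minimal sketch. It assumes the team's model is an unsmoothed, count-based bigram model, which the question does not state outright. The toy corpus and the names `context_counts`, `bigram_counts`, `bigram_prob`, and `sequence_prob` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy corpus: it contains "innovative ideas" and "master chef",
# but never the two-word sequence "innovative chef".
corpus = [
    "the innovative ideas win".split(),
    "the master chef prepares dinner".split(),
    "innovative ideas matter".split(),
]

context_counts = Counter()  # how often each word appears as a left context
bigram_counts = Counter()   # how often each (previous word, word) pair appears
for sentence in corpus:
    for prev, word in zip(sentence, sentence[1:]):
        context_counts[prev] += 1
        bigram_counts[(prev, word)] += 1


def bigram_prob(prev, word):
    """Unsmoothed maximum-likelihood estimate of P(word | prev)."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]


def sequence_prob(words):
    """Product of bigram probabilities over consecutive word pairs."""
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= bigram_prob(prev, word)
    return prob


print(bigram_prob("innovative", "ideas"))  # seen pair: non-zero
print(bigram_prob("innovative", "chef"))   # unseen pair: zero
print(sequence_prob("the innovative chef prepares".split()))
```

Because the estimate is a ratio of raw counts, a single unseen pair drives the probability of the whole phrase to zero, regardless of how often each word was seen in other contexts.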
Tags
Data Science
Deep Learning (in Machine learning)
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Large Language Models (LLMs)
BERT (Bidirectional Encoder Representations from Transformers)
Bengio et al. (2003) Feed-Forward Neural Language Model
NLM Advantage Over Traditional Models
Language Model Generalization