Learn Before
Multiple Choice

A team is developing a language model to predict the next word in a sentence. They find that their model assigns a probability of zero to the phrase 'the innovative chef prepares...' because it has never seen the specific two-word sequence 'innovative chef' in its training data, despite having seen 'innovative ideas' and 'master chef' many times. Which characteristic of a neural network-based approach to language modeling is specifically designed to overcome this type of generalization failure?
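The scenario in the question can be made concrete with a toy comparison. The sketch below (illustrative only; the word vectors and counts are invented for this example) contrasts a count-based bigram model, which assigns exactly zero probability to any unseen word pair, with a neural-style model whose probabilities come from similarity between dense word embeddings and therefore never hit exactly zero:

```python
import math

# Count-based bigram model: probability is a ratio of observed counts,
# so an unseen pair like ("innovative", "chef") gets exactly 0.
bigram_counts = {("innovative", "ideas"): 120, ("master", "chef"): 95}
unigram_counts = {"innovative": 120, "master": 95}

def bigram_prob(prev, word):
    return bigram_counts.get((prev, word), 0) / unigram_counts.get(prev, 1)

# Neural-style model: each word is a dense vector, and next-word
# probability is a softmax over dot-product scores. Because "chef"
# lives near other plausible continuations in embedding space, its
# probability after "innovative" is small but strictly positive.
embeddings = {
    "innovative": [0.9, 0.1],
    "ideas":      [0.8, 0.3],
    "master":     [0.2, 0.9],
    "chef":       [0.3, 0.8],
}

def neural_prob(prev, word):
    scores = {w: sum(a * b for a, b in zip(embeddings[prev], v))
              for w, v in embeddings.items()}
    z = sum(math.exp(s) for s in scores.values())
    return math.exp(scores[word]) / z

print(bigram_prob("innovative", "chef"))       # 0.0 -- the failure mode
print(neural_prob("innovative", "chef") > 0)   # True -- generalization
```

The key point for the question: shared distributed representations let the neural model transfer what it learned about "innovative" and "chef" from other contexts, rather than treating every word sequence as an isolated count.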


Updated 2025-09-28


Tags

Data Science

Deep Learning (in Machine learning)

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science