Learn Before
From Prediction to Understanding
A language model is trained exclusively on the task of predicting a single missing word in sentences from a massive, unlabeled text corpus. Explain how this simple objective forces the model to learn complex properties of language, such as grammar and semantic relationships, without any explicit human-provided labels.
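The intuition can be made concrete with a minimal sketch. The toy corpus, function names, and count-based "model" below are all hypothetical illustrations, not a real neural language model: instead of gradient descent, it simply tallies which words appear between each pair of neighbors. Even this crude fill-in-the-blank predictor is forced to absorb regularities of the text — for instance, that only nouns fit the slot in "the ___ sat" — without ever being given a grammar label.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus standing in for a massive unlabeled text collection.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
    "a dog chased a cat",
]

# Count how often each word occurs between a given (left, right) neighbor pair.
context_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(1, len(words) - 1):
        context = (words[i - 1], words[i + 1])
        context_counts[context][words[i]] += 1

def predict_missing(left, right):
    """Predict the masked word from its immediate neighbors."""
    candidates = context_counts.get((left, right))
    return candidates.most_common(1)[0][0] if candidates else None

# "the ___ sat": the counts implicitly encode that a noun fills this slot.
print(predict_missing("the", "sat"))
# "sat ___ the": here only a preposition ("on") has ever appeared.
print(predict_missing("sat", "the"))
```

A real language model replaces the count table with learned parameters and far richer context, but the pressure is the same: to reduce prediction error, it must internalize which word classes and meanings are compatible with each context — which is exactly grammatical and semantic knowledge.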
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An AI team trains a language model on a massive dataset of books and articles. The training process consists of a single, repeated task: the model is presented with a sentence where one word has been removed, and its goal is to predict the original missing word. The model is not given any other information or explicit rules about grammar or meaning. Based on this training method alone, what fundamental understanding is the model most likely developing?
Comparing Word Prediction Strategies
From Prediction to Understanding