Learn Before
Comparing Word Prediction Strategies
A language model can be trained by predicting the next word in a sequence or by predicting a masked word in the middle of a sequence. Analyze how these two word-prediction objectives might lead the model to develop different internal representations of language. Discuss the potential strengths and weaknesses of each approach for tasks requiring deep understanding versus tasks requiring coherent text generation.
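To make the contrast concrete, here is a minimal sketch (with a hypothetical toy sentence and a `masked_pair` helper invented for illustration) of how the two objectives turn the same word sequence into different (input, target) training pairs: the next-word objective exposes only left context, while the masked-word objective exposes context on both sides.

```python
# Toy illustration of the two word-prediction objectives.
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Causal (next-word) objective: at each position the model sees only
# the tokens to its left and must predict the following token.
causal_pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

# Masked (cloze-style) objective: one position is replaced with a
# [MASK] token; the model sees context on BOTH sides and must
# recover the original word.
def masked_pair(seq, i, mask="[MASK]"):
    corrupted = seq[:i] + [mask] + seq[i + 1:]
    return corrupted, seq[i]

masked_input, masked_target = masked_pair(tokens, 2)

print(causal_pairs[2])   # (['the', 'cat', 'sat'], 'on')
print(masked_input)      # ['the', 'cat', '[MASK]', 'on', 'the', 'mat']
print(masked_target)     # sat
```

Note the asymmetry: the causal pairs never let the model peek ahead, which matches the left-to-right generation setting, whereas the masked pair gives bidirectional context, which tends to suit understanding-oriented tasks.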
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An AI team trains a language model on a massive dataset of books and articles. The training process consists of a single, repeated task: the model is presented with a sentence where one word has been removed, and its goal is to predict the original missing word. The model is not given any other information or explicit rules about grammar or meaning. Based on this training method alone, what fundamental understanding is the model most likely developing?
Comparing Word Prediction Strategies
From Prediction to Understanding