Learn Before
Explaining Emergence in Language Models
A large language model is trained solely on the objective of predicting the next word over a vast corpus of text. After training, the model turns out to be able to perform tasks such as sentiment analysis or translation simply by being shown a few examples in its input, with no further training or parameter updates. Explain why this newly discovered capability is described as an 'emergent ability' rather than a directly trained one.
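As a minimal sketch of the mechanism the question describes: in few-shot (in-context) learning, the task is specified entirely inside the prompt text, and the model's parameters are never updated. The helper below is hypothetical, purely illustrative, and uses no model API; it only shows how demonstrations and a query are packaged into a single input string.

```python
def build_few_shot_prompt(examples, query):
    """Assemble labeled demonstrations plus a new query into one prompt string.

    Hypothetical helper for illustration: the 'training signal' here is just
    text in the input; no gradient step or parameter update occurs.
    """
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    # The final block leaves the label blank for the model to complete.
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)


demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I regret buying a ticket for this film.", "negative"),
]
prompt = build_few_shot_prompt(demos, "An unforgettable performance.")
```

The resulting `prompt` would be fed to the model as ordinary input; the model is expected to continue it with a sentiment label, even though next-word prediction was its only training objective.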
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team observes that its language models with fewer than 10 billion parameters consistently fail to perform novel tasks when given a few examples in the input prompt. However, once the models are scaled beyond 50 billion parameters, they suddenly gain the ability to perform these tasks successfully using the provided examples, even though the pre-training objective and data distribution were unchanged. Which statement best analyzes this observation?
Critique of a Pre-Training Strategy
Explaining Emergence in Language Models