Learn Before
A research team observes that their language models with fewer than 10 billion parameters consistently fail at novel tasks when given a few examples in the input prompt. However, once the models are scaled beyond 50 billion parameters, they suddenly gain the ability to perform these tasks successfully using the provided examples, even though the pre-training objective and data distribution remain unchanged. Which statement best analyzes this observation?
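To make the scenario concrete, here is a minimal sketch of the kind of few-shot prompt the team might be feeding to both model sizes. The word-reversal task, the build_few_shot_prompt helper, and the demonstrations are illustrative assumptions, not part of the original question; no particular model API is implied.

```python
# Minimal sketch of a few-shot prompt for a novel task (assumed example:
# reversing a word). Only prompt construction is shown, since the scenario
# does not name a specific model or API.

def build_few_shot_prompt(examples, query):
    """Concatenate labeled demonstrations followed by an unlabeled query."""
    blocks = [f"Input: {text}\nOutput: {label}" for text, label in examples]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

# Hypothetical demonstrations provided in the input prompt.
demos = [("cat", "tac"), ("river", "revir")]
print(build_few_shot_prompt(demos, "planet"))
# In the observed scenario, models under ~10B parameters fail to infer the
# mapping from these demonstrations, while models beyond ~50B succeed,
# despite an identical pre-training objective and data distribution.
```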
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Critique of a Pre-Training Strategy
Explaining Emergence in Language Models