Learn Before
Critique of a Pre-Training Strategy
A machine learning engineer proposes a new pre-training strategy for a large language model. They argue: "Since the ability to learn from examples provided in a prompt is so powerful, we should explicitly include thousands of task-formatted examples (e.g., 'Translate this sentence: [sentence] -> [translation]') directly in our massive pre-training dataset. This will directly train the model to perform this skill." Evaluate the engineer's proposal. In your evaluation, explain why this approach might not be the most effective way to develop this capability in a large model.
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team observes that their language models with fewer than 10 billion parameters consistently fail to perform novel tasks when given a few examples in the input prompt. However, once the models are scaled beyond 50 billion parameters, they suddenly gain the ability to perform these tasks successfully using the provided examples, even though the pre-training objective and data distribution were not changed. Which statement best analyzes this observation?
Critique of a Pre-Training Strategy
Explaining Emergence in Language Models