Explaining Unexpected Model Capabilities
A research team pre-trains a large language model exclusively on a massive corpus of public domain books and web articles. Before any subsequent training phase, they provide the model with the prompt: "Summarize the following paragraph in a single sentence: [Paragraph text]". To their surprise, the model produces a coherent and accurate summary. How can this behavior be explained, given that the model was never explicitly trained on summarization tasks?
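The key observation is that an instruction prompt is nothing special to a pre-trained model: the instruction and the paragraph are concatenated into one text sequence, and the model simply continues it with next-token prediction, drawing on instruction-like patterns it saw in its corpus. A minimal sketch of the prompt construction (the paragraph text and variable names here are illustrative assumptions, not from the question):

```python
# Zero-shot prompting: the instruction and the input paragraph are joined
# into a single text sequence. A pre-trained language model receives this
# string and continues it via ordinary next-token prediction; no
# summarization-specific training objective is involved.
paragraph = "Large language models learn statistical patterns from vast text corpora."
prompt = (
    "Summarize the following paragraph in a single sentence: "
    f"{paragraph}"
)
# A real model call would look something like model.generate(tokenize(prompt));
# the exact API depends on the framework and is assumed here for illustration.
print(prompt)
```

Because the corpus contains many naturally occurring instruction-response patterns (headlines above articles, abstracts before papers, "TL;DR" posts), continuing such a prompt plausibly yields a summary even without explicit summarization training.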
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Challenge of Opaque Pre-Training Data in Fine-Tuning
Explaining Emergent Zero-Shot Abilities