Explaining Unintended Model Capabilities
A company fine-tunes a large, pre-trained language model using a dataset composed exclusively of its internal customer service conversations. The goal is to create a chatbot that only answers product-related questions. After deployment, the company discovers that while the chatbot handles product questions well, it also successfully writes short stories and provides cooking recipes when prompted by users. Explain the most likely underlying reason for the model's ability to perform these out-of-scope tasks.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Persistent General-Purpose Behavior: Math Fine-Tuning
Using Diverse Data to Steer LLM Specialization
A development team adapts a large, pre-existing language model to function as a specialized chatbot for a legal information service. The adaptation process uses a dataset consisting solely of legal questions and their corresponding factual answers. After deployment, the team finds that the chatbot accurately answers legal queries but also responds correctly when users ask it to write poems or summarize news articles. Which statement provides the most accurate explanation for the chatbot's behavior?
Diagnosing Unexpected Model Behavior