Diagnosing Generalization Failure in a Legal AI
Based on the case study below, analyze the likely cause of the model's poor performance and explain which dimension of training data diversity is lacking.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Related
An AI team is building a general-purpose chatbot. They train two different models on a large dataset of text summarization tasks.
- Model A is trained using 100,000 different articles, but every training example uses the exact same instruction: "Summarize the following text."
- Model B is trained using only 10,000 different articles, but the training examples use 1,000 varied instructions for summarization (e.g., "Give me the gist," "What are the key points?," "Provide a brief overview.").
When a user gives the prompt, "Can you give me the TL;DR for this article?", which model is more likely to fail at the task, and what is the most probable reason for its failure?
Diagnosing a Model's Generalization Failure
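The Model A / Model B question above contrasts two separate dimensions of training data diversity: how many distinct articles the model sees, and how many distinct instruction phrasings it sees. A minimal sketch of how one might measure both is given here. The function name `diversity_profile`, the `(instruction, article_id)` pair layout, and the placeholder instruction strings are all assumptions made for illustration, and the datasets are scaled down by a factor of 100 from the numbers in the question.

```python
def diversity_profile(examples):
    """Count distinct instructions and distinct articles.

    `examples` is a list of (instruction, article_id) pairs. This layout is
    an assumption for illustration, not a format taken from the question.
    """
    instructions = {instruction for instruction, _ in examples}
    articles = {article_id for _, article_id in examples}
    return {
        "distinct_instructions": len(instructions),
        "distinct_articles": len(articles),
    }


# Scaled-down stand-in for Model A's data: many articles, one fixed instruction.
model_a_data = [("Summarize the following text.", i) for i in range(1000)]

# Scaled-down stand-in for Model B's data: fewer articles, varied instructions.
# The phrasings are hypothetical placeholders, not real training prompts.
phrasings = [f"hypothetical summarization phrasing #{k}" for k in range(10)]
model_b_data = [(phrasings[i % len(phrasings)], i) for i in range(100)]

profile_a = diversity_profile(model_a_data)
profile_b = diversity_profile(model_b_data)

print("Model A:", profile_a)
print("Model B:", profile_b)
```

Comparing the two profiles side by side shows which dimension each dataset covers well and which it leaves at a single value, which is the distinction the question asks you to reason about.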