Fine-Tuning Performance Analysis
A data science team starts with a large language model that excels at summarizing general news articles. They then adapt this model by training it exclusively on a large dataset of legal contracts to create a specialized legal summarizer. The new model achieves state-of-the-art performance on summarizing legal documents. However, when they later evaluate this specialized model on its original task of summarizing news articles, they find its performance has significantly degraded; the summaries are often convoluted and miss the main points. Analyze the most probable cause for this performance degradation on the original task.
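The failure mode described here can be made concrete with a deliberately tiny numeric sketch. The model below is a hypothetical one-parameter regressor, not anything from the scenario: task A ("news") and task B ("legal") are reduced to fitting the same shared weight to two conflicting targets. Because fine-tuning on task B optimizes an objective that contains nothing about task A, the shared weight is simply overwritten.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the two tasks: one shared weight w, targets y = w_true * x.
# Task A ("news"): w_true = 2.  Task B ("legal"): w_true = -3.
x = rng.normal(size=200)
y_a = 2.0 * x
y_b = -3.0 * x

def mse(w, xs, ys):
    return float(np.mean((w * xs - ys) ** 2))

def sgd(w, xs, ys, lr=0.1, steps=200):
    # Plain gradient descent on squared error.
    for _ in range(steps):
        w -= lr * np.mean(2 * (w * xs - ys) * xs)
    return w

# "Pre-train" on task A: w converges near 2, so task-A loss is near zero.
w = sgd(0.0, x, y_a)
loss_a_before = mse(w, x, y_a)

# Fine-tune exclusively on task B: nothing in the objective preserves
# task A, so the same weight drifts toward -3 and task-A loss climbs.
w = sgd(w, x, y_b)
loss_a_after = mse(w, x, y_a)

print(f"task-A loss before fine-tuning: {loss_a_before:.4f}")
print(f"task-A loss after fine-tuning:  {loss_a_after:.4f}")
```

The analogy to the scenario: a large language model's tasks share one set of parameters, so gradient updates driven only by legal contracts move those parameters away from the configuration that supported news summarization.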
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Catastrophic Forgetting in Fine-Tuning
Fine-Tuning Performance Analysis
A research team starts with a large language model pre-trained on a massive, diverse text corpus, which shows strong performance across a general language understanding benchmark. They then fine-tune this model on a small, highly specialized dataset for classifying medical research abstracts. After fine-tuning, the model achieves 99% accuracy on the medical abstract test set, but when re-evaluated on the original general language benchmark, its performance has dropped by 20%. What is the most likely explanation for this outcome?
Consequences of Specialized Fine-Tuning
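Both questions above turn on the same mechanism, and one standard mitigation is rehearsal (experience replay): mixing some original-task data back into the fine-tuning updates. The sketch below extends the same hypothetical one-parameter toy setup (all names and targets are illustrative assumptions, not from the questions) and compares naive fine-tuning against a 20% rehearsal mix.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y_gen = 2.0 * x    # general task (toy stand-in for the broad benchmark)
y_spec = -3.0 * x  # specialized task (toy stand-in for the narrow dataset)

def mse(w, xs, ys):
    return float(np.mean((w * xs - ys) ** 2))

def sgd(w, xs, ys, lr=0.1, steps=300):
    # Plain gradient descent on squared error for a single weight.
    for _ in range(steps):
        w -= lr * np.mean(2 * (w * xs - ys) * xs)
    return w

def sgd_replay(w, xs, y_new, y_old, frac=0.2, lr=0.1, steps=300):
    # Rehearsal: each update blends in a weighted gradient from replayed
    # old-task data, so w is never optimized for the new task alone.
    for _ in range(steps):
        g_new = np.mean(2 * (w * xs - y_new) * xs)
        g_old = np.mean(2 * (w * xs - y_old) * xs)
        w -= lr * ((1 - frac) * g_new + frac * g_old)
    return w

w0 = sgd(0.0, x, y_gen)                      # "pre-train" on the general task
w_naive = sgd(w0, x, y_spec)                 # fine-tune on specialized data only
w_replay = sgd_replay(w0, x, y_spec, y_gen)  # fine-tune with 20% rehearsal

print(f"general-task loss, naive fine-tune: {mse(w_naive, x, y_gen):.3f}")
print(f"general-task loss, with rehearsal:  {mse(w_replay, x, y_gen):.3f}")
```

The rehearsal run keeps more of the general-task performance, at the cost of a slightly worse fit on the specialized task; in this toy, where the two targets directly conflict, the weight settles at a compromise between them.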