The Interplay of Overfitting and Knowledge Loss in Model Tuning
Explain the relationship between a language model overfitting to a new dataset during fine-tuning and the phenomenon in which the model loses its previously acquired general knowledge (catastrophic forgetting). In your explanation, describe how intensive training on a narrow dataset can lead to both outcomes.
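The mechanism behind this question can be illustrated with a deliberately tiny toy model: a hypothetical two-parameter linear model is first "pre-trained" on a broad task, then fine-tuned intensively on a narrow slice of inputs with a different target mapping. The parameters drift to fit the narrow data, and error on the original broad task rises sharply. This is only a minimal sketch of the dynamics, not how language-model fine-tuning is actually implemented; all function names and data distributions here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# "General" task: y = 2x + 1, inputs spread across a wide range
x_gen = rng.uniform(-5, 5, size=200)
y_gen = 2 * x_gen + 1

# "Narrow" task: a different mapping, inputs confined to a tiny range
x_narrow = rng.uniform(4.0, 4.2, size=50)
y_narrow = -3 * x_narrow + 10

def mse(w, b, x, y):
    """Mean squared error of the linear model w*x + b on data (x, y)."""
    return float(np.mean((w * x + b - y) ** 2))

def sgd_fit(w, b, x, y, lr=0.01, steps=1000):
    """Plain gradient descent on the MSE of a 1-D linear model."""
    for _ in range(steps):
        err = w * x + b - y
        w -= lr * np.mean(err * x)
        b -= lr * np.mean(err)
    return w, b

# "Pre-train" on the general task
w, b = sgd_fit(0.0, 0.0, x_gen, y_gen)
general_before = mse(w, b, x_gen, y_gen)

# Intensive fine-tuning on the narrow task only, for many iterations
w, b = sgd_fit(w, b, x_narrow, y_narrow, steps=20000)
general_after = mse(w, b, x_gen, y_gen)

# Error on the original task grows by orders of magnitude:
# the parameters that encoded the general mapping were overwritten.
print(general_before, general_after)
```

The same update rule produces both symptoms at once: fitting the narrow data ever more closely (overfitting) necessarily moves the shared parameters away from the values that encoded the general task (forgetting), because nothing in the loss penalizes that drift.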
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Strategies to Mitigate Overfitting and Catastrophic Forgetting in SFT
Fine-Tuning Performance Degradation
A team fine-tunes a large, pre-trained language model, known for its strong general knowledge, on a highly specialized dataset of legal contracts. They train the model for a very large number of iterations. After fine-tuning, the model demonstrates exceptional performance in generating and interpreting legal text but now provides nonsensical or incorrect answers to simple, general knowledge questions it could easily answer before. What is the most likely explanation for this change in the model's behavior?