Catastrophic Forgetting in Fine-Tuning
Catastrophic forgetting is the phenomenon in which a neural network loses previously learned knowledge after being trained on new data. In fine-tuning, it arises because the weight updates that adapt a model to a new task can overwrite the representations that supported its earlier capabilities, causing a significant drop in performance on the original tasks the model was previously proficient in.
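The effect is easy to reproduce at small scale. The sketch below is a minimal illustration, assuming PyTorch is installed; the synthetic tasks, model size, and hyperparameters are invented for demonstration and are not from this page. It trains a small classifier on one task, fine-tunes it on a second task without ever revisiting the first, and measures how accuracy on the first task degrades.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_task(rule):
        # A synthetic binary classification task defined by a fixed linear rule.
        X = torch.randn(512, 20)
        y = (X @ rule > 0).long()
        return X, y

    def train(model, X, y, steps=300, lr=0.05):
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(steps):
            opt.zero_grad()
            loss_fn(model(X), y).backward()
            opt.step()

    @torch.no_grad()
    def accuracy(model, X, y):
        return (model(X).argmax(dim=1) == y).float().mean().item()

    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    task_a = make_task(torch.randn(20))  # stands in for the broad original task
    task_b = make_task(torch.randn(20))  # stands in for the narrow new task

    train(model, *task_a)
    print(f"Task A accuracy after initial training: {accuracy(model, *task_a):.2f}")

    train(model, *task_b)  # fine-tuning never revisits Task A
    print(f"Task A accuracy after fine-tuning:      {accuracy(model, *task_a):.2f}")
    print(f"Task B accuracy after fine-tuning:      {accuracy(model, *task_b):.2f}")

On a typical run, Task A accuracy is near 1.00 after initial training and falls well below that after fine-tuning on Task B, mirroring the benchmark regressions described in the questions below; exact numbers vary with the seed.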
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Related

Fine-Tuning Performance Analysis
A research team starts with a large language model pre-trained on a massive, diverse text corpus, which shows strong performance across a general language understanding benchmark. They then fine-tune this model on a small, highly specialized dataset for classifying medical research abstracts. After fine-tuning, the model achieves 99% accuracy on the medical abstract test set, but when re-evaluated on the original general language benchmark, its performance has dropped by 20%. What is the most likely explanation for this outcome?

Consequences of Specialized Fine-Tuning
A team starts with a large language model that is highly proficient at a wide range of general language tasks, including text summarization, translation, and question-answering. They then fine-tune this model exclusively on a new, highly specialized dataset of legal document summaries. After this training, the model becomes excellent at summarizing legal documents but is now significantly worse at performing general translation than it was before. Which phenomenon does this scenario most directly demonstrate?

Diagnosing Performance Degradation in a Fine-Tuned Model

Illustrating a Key Fine-Tuning Challenge

Learn After

Mitigation Strategies for Catastrophic Forgetting
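The "Mitigation Strategies for Catastrophic Forgetting" card above covers remedies in depth. As a brief preview, one simple and widely used remedy is rehearsal (also called replay): mixing a retained sample of original-task data into every fine-tuning step. The sketch below extends the toy experiment at the top of this page; the equal loss weighting and the 64-example replay buffer are illustrative assumptions, not prescriptions.

    def train_with_replay(model, X_new, y_new, X_old, y_old, steps=300, lr=0.05):
        # Fine-tune on the new task while replaying a retained old-task sample,
        # so every update balances both objectives.
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(steps):
            opt.zero_grad()
            loss = loss_fn(model(X_new), y_new) + loss_fn(model(X_old), y_old)
            loss.backward()
            opt.step()

    # Repeat the experiment with replay: re-initialize, train on Task A,
    # then fine-tune on Task B while replaying 64 retained Task A examples.
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    train(model, *task_a)
    keep = torch.randperm(512)[:64]
    train_with_replay(model, *task_b, task_a[0][keep], task_a[1][keep])
    print(f"Task A accuracy with replay: {accuracy(model, *task_a):.2f}")
    print(f"Task B accuracy with replay: {accuracy(model, *task_b):.2f}")

Compared with the plain fine-tuning run, Task A accuracy typically stays much closer to its pre-fine-tuning level, at some cost in how quickly the model converges on Task B.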