Diagnosing a Performance Gap After Development
Case context: Your team has just finished a three-month development cycle for a new image classification algorithm. Throughout the process, the team continuously checked performance against the dev set, tweaking the architecture to maximize accuracy. When development concluded, the algorithm achieved 98% accuracy on the dev set but only 82% on the test set.
Question: Based on the provided scenario, what specific problem should you diagnose regarding your dataset evaluation, and what is the recommended next step to address this issue?
Sample answer: The team should diagnose that the algorithm has gradually overfit to the dev set because it was repeatedly evaluated and tuned against that set during the three-month cycle. The evidence is the 16-percentage-point gap between dev-set accuracy (98%) and test-set accuracy (82%). The recommended next step is to get a fresh dev set to replace the one the algorithm has overfit to.
Key points:
- Diagnose dev-set overfitting.
- Identify that the large gap between dev (98%) and test (82%) accuracy indicates overfitting to the dev set.
- Recommend getting a fresh dev set.
Rubric: Full credit is given for diagnosing dev-set overfitting due to repeated evaluations and prescribing the acquisition of a new dev set.
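The diagnosis above comes down to comparing two numbers, which can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `diagnose_dev_test_gap` and the 5-point `gap_threshold` are hypothetical choices made for this example, since the case does not specify how large a gap counts as "much better".

```python
def diagnose_dev_test_gap(dev_accuracy, test_accuracy, gap_threshold=0.05):
    """Compare dev and test accuracy (both as fractions in [0, 1]).

    Returns (gap, flagged), where flagged is True when dev accuracy
    exceeds test accuracy by more than gap_threshold.
    gap_threshold is an assumed cutoff for illustration only.
    """
    gap = dev_accuracy - test_accuracy
    flagged = gap > gap_threshold
    return gap, flagged


# Numbers from the case: 98% on the dev set, 82% on the test set.
gap, flagged = diagnose_dev_test_gap(0.98, 0.82)
print(f"dev-test gap: {gap:.2f}")
if flagged:
    print("Possible dev-set overfitting: consider getting a fresh dev set.")
```

A check like this would run once development has concluded, as in the case, so that the test set is not used for routine progress tracking.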
Tags
Machine Learning
Deep Learning
Machine Learning Strategy
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Yearning @ DeepLearning.AI
Related
Avoiding Test Set Decisions During Regular Progress Tracking
What causes an algorithm to gradually overfit to the dev set during the development process?
If dev set performance is much better than test set performance after development, this is a sign of dev-set overfitting.
When dev set performance is much better than test set performance, Machine Learning Yearning recommends you get a _____ dev set.
Match each concept related to dev-set overfitting to its correct description.
Arrange the steps of the full dev-set overfitting lifecycle in the correct order.
After development, your dev set performance is far better than your test set performance. What does Machine Learning Yearning recommend?
According to Machine Learning Yearning, you should regularly evaluate your algorithm on the test set throughout development to track progress.
The process of repeatedly evaluating ideas on the dev set causes your algorithm to gradually _____ to the dev set.
Match each verbatim phrase from Machine Learning Yearning to what it refers to or signifies.
Arrange the reasoning steps for diagnosing and responding to a dev-test performance gap in the correct order.
Analyzing the Cause and Solution for Dev Set Overfitting
Identifying the Indicator of Dev Set Overfitting