Diagnosing Variance in a Cat-Detection Algorithm
Case context: You are evaluating a cat-detection algorithm. You know that human-level performance is nearly perfect, setting the optimal error rate at about 0%. After running your model, you observe a 1% error on the training set, a 5% error on the training-dev set, and a 5% error on the dev set.
Question: Based on the provided error metrics, what is the primary diagnosis for the algorithm's performance, and which specific error comparison justifies this conclusion?
Sample answer: The primary diagnosis is that the algorithm has high variance. This is justified by comparing the 1% training error to the 5% training-dev error. Since the 1% training error is already very close to the 0% optimal error, bias is not the main issue. The 4% increase in error when moving to the training-dev set shows the model struggles to generalize to new data from the same distribution.
Key points:
- Primary diagnosis is high variance.
- Justification relies on the gap between the 1% training error and 5% training-dev error.
- Notes that the training error (1%) is close to the optimal error (~0%), ruling out high bias.
Rubric: The answer must explicitly state 'high variance' as the diagnosis and justify it by highlighting the gap between the 1% training error and 5% training-dev error.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
What primary problem does a model with 1% training error and 5% training-dev error exhibit when optimal error is ~0%?
True or False: When training error is 1%, training-dev error is 5%, and optimal error is ~0%, the algorithm primarily has high bias.
When training error is 1% and training-dev error is 5%, with near-zero optimal error, the algorithm has high _____.
Match each error metric to its role in diagnosing high variance in the cat detection scenario.
Order the steps for diagnosing the primary error type when given training, training-dev, and dev error values.
Which gap in the error analysis directly signals high variance in a model with 1% training error, 5% training-dev error, and 5% dev error?
True or False: In the ML Yearning cat detection example used to illustrate high variance, the dev error and training-dev error are both 5%.
The avoidable _____ is only ~1% in this scenario because training error (1%) is very close to optimal error (~0%).
Match each diagnostic gap to the problem type it reveals in the cat detection error analysis.
Order the reasoning steps that lead to the conclusion 'this model has high variance' given the cat detection error values.
Analyzing High Variance Using Training and Training-Dev Error Rates
Diagnosing Variance in a Cat-Detection Algorithm
Indicator of High Variance from Error Sets