Diagnosing Overfitting in a Cat Image Recognizer
Case context: You are developing a cat classification model. After training, your algorithm achieves a 1% error rate on your training set but an 11% error rate on your development set. An ideal human classifier achieves nearly 0% error.
Question: Based on the error metrics provided, diagnose the primary problem with this classifier. What specific ML terminology describes this situation, and what do the 1% and 10% values represent?
Sample answer: The primary problem is that the classifier is overfitting to the training data. The 1% training error represents the estimated bias, which is very low. The 10% difference between the dev set error (11%) and the training error (1%) represents the variance. Because the variance is high, the model is failing to generalize to new data.
Key points:
- Diagnose the problem as overfitting or failing to generalize.
- Identify the 1% metric as the estimated bias.
- Calculate or identify the 10% gap as the estimated variance.
- Recognize that high variance is the primary issue.
Rubric: A correct diagnosis must identify the issue as overfitting or high variance. It should correctly assign the 1% error to estimated bias and calculate the 10% difference as estimated variance.
0
1
Tags
Machine Learning
Deep Learning
Machine Learning Strategy
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Yearning @ DeepLearning.AI
Related
In the cat classification example, if training error is ~1% and dev error is 11%, what is the estimated variance?
A classifier with high variance has very low training error but fails to generalize to the dev set.
When a classifier has very low training error but high dev error, this is also called _____.
Match each metric or concept to its correct description in the cat classification bias-variance example.
Order the steps used in ML Yearning to diagnose high variance in the cat classification example.
Which pattern of training and dev set errors characterizes a high variance classifier in the cat example?
In ML Yearning's cat classification example, a classifier diagnosed with high variance has high training error.
In the cat classification example, variance is estimated as _____ minus training error.
Match each term to its role in diagnosing high variance from the cat classification example.
Order the reasoning steps to conclude the cat classifier is overfitting based on its error metrics.
Analyzing the Cat Classifier's Error Metrics
Diagnosing Overfitting in a Cat Image Recognizer
Terminology for High Variance Generalization Failure