Diagnose the 10%/11%/20% error scenario
Question: An algorithm exhibits a 10% training error, an 11% training-dev error, and a 20% dev error. Assuming human-level error is near 0%, analyze the algorithm's performance issues based on these three error metrics. Explain which specific ML problems the algorithm suffers from and which it does not.
Sample answer: The algorithm suffers from high avoidable bias and data mismatch, but not from high variance on the training set distribution. With human-level error near 0%, almost all of the 10% training error is avoidable bias. The small gap of 1 percentage point between the training error (10%) and the training-dev error (11%) shows that the algorithm generalizes well to unseen data from the training distribution, so variance is low. Finally, the large gap of 9 percentage points between the training-dev error (11%) and the dev error (20%) indicates data mismatch.
Key points:
- Identifies high avoidable bias due to the gap between near-0% human-level error and 10% training error.
- Identifies lack of high variance due to small gap between training and training-dev error (10% vs 11%).
- Identifies data mismatch due to large gap between training-dev error and dev error (11% vs 20%).
Rubric: The response must identify high avoidable bias, data mismatch, and the lack of high variance. It should justify each using the specific gaps between the provided error metrics.
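The gap arithmetic behind this diagnosis can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `diagnose` and the 5-percentage-point `threshold` for calling a gap "high" are assumptions made for this example, not values taken from the source.

```python
def diagnose(train_err, train_dev_err, dev_err, human_err=0.0, threshold=5.0):
    """Split the errors (in percent) into three gaps and flag the large ones.

    `threshold` is a hypothetical cutoff in percentage points; what counts
    as a "high" gap depends on the application.
    """
    gaps = {
        # Training error above human-level error
        "avoidable_bias": train_err - human_err,
        # Extra error on unseen data from the training distribution
        "variance": train_dev_err - train_err,
        # Extra error when moving to the dev distribution
        "data_mismatch": dev_err - train_dev_err,
    }
    problems = [name for name, gap in gaps.items() if gap >= threshold]
    return gaps, problems


gaps, problems = diagnose(10.0, 11.0, 20.0)
print(gaps)      # {'avoidable_bias': 10.0, 'variance': 1.0, 'data_mismatch': 9.0}
print(problems)  # ['avoidable_bias', 'data_mismatch']
```

Under these assumptions, the 10-point and 9-point gaps are flagged while the 1-point variance gap is not, which matches the sample answer.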
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
What does the 1% gap between training error (10%) and training-dev error (11%) indicate in the ML Yearning scenario?
True or False: An algorithm with 10% training error, 11% training-dev error, and 20% dev error suffers from high variance on the training set distribution.
With 10% training error, 11% training-dev error, and 20% dev error, the algorithm suffers from high avoidable bias and _____, but not high variance.
Match each error gap from the 10%/11%/20% scenario to the ML problem it diagnoses.
Order the diagnostic reasoning steps used to conclude the 10%/11%/20% algorithm has avoidable bias and data mismatch but not high variance.
Which combination of problems does an algorithm with 10% training error, 11% training-dev error, and 20% dev error have (assuming ~0% human-level error)?
True or False: In the 10%/11%/20% scenario, data mismatch contributes a larger performance drop than variance when going from training to dev error.
In the 10%/11%/20% scenario, the gap between training error and _____ error is used to assess variance on the training set distribution.
Match each ML problem type to the error evidence that confirms or rules it out in the 10%/11%/20% scenario.
Order the three error measurements from lowest to highest as reported in ML Yearning's high-avoidable-bias and data-mismatch scenario.