Explain why dev set label quality becomes more important over time.
Question: Briefly explain why mislabeled examples in the dev set can become a more pressing issue as a machine learning algorithm's performance improves.
Sample answer: As the algorithm's performance improves and overall errors decrease, the fraction of errors caused by mislabeled dev examples grows relative to the total set of errors. This eventually adds significant noise to accuracy estimates, making it worthwhile to improve label quality.
Key points:
- Overall errors decrease as the system improves.
- The relative fraction of mislabeled errors grows.
- Significant error is added to accuracy estimates.
Rubric: The answer must state that the relative fraction of mislabeled errors grows as the system improves, which distorts accuracy estimates.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
Why do mislabeled dev set examples become more impactful as a classifier improves?
It is acceptable to initially tolerate mislabeled dev/test examples and reconsider that decision as the system improves.
When mislabeled dev examples account for _____ of all errors, improving dev-set label quality becomes worthwhile.
Match each scenario to its correct implication regarding mislabeled dev set examples.
Order the reasoning steps for deciding whether to invest in fixing mislabeled dev set labels.
A classifier has ~2% dev error; 30% of those errors stem from mislabeled dev images. What should you do?
The difference between a classifier error of 1.4% and 2% is a minor detail with little practical significance.
As a classifier improves, the fraction of errors due to mislabeled dev examples _____ relative to total errors.
Match each concept to its role in the growing relative impact of mislabeled dev examples.
Order the stages of how mislabeled dev examples grow in importance across a classifier's development lifecycle.
Analyze the changing impact of mislabeled dev examples over a model's lifecycle.
Decide whether to clean dev set labels for a highly accurate classifier.
Explain why dev set label quality becomes more important over time.