Scaling Up Manual Review
Case context: A data science team is debugging their image classifier. They have a massive dataset and the model is generating thousands of errors. The lead engineer wants to manually analyze 500 errors to get an extremely detailed breakdown of the failure modes, but a junior engineer argues they only need to look at 100.
Question: Based on best practices for Eyeball dev sets, evaluate the lead engineer's proposal. Is it acceptable, and why?
Sample answer: The lead engineer's proposal is acceptable. Reviewing about 100 mistakes is enough to get a very good sense of the major sources of error, but there is no harm in manually analyzing more (such as 500) as long as there is enough data. The team meets that condition: its dataset is massive and the model is producing thousands of errors to draw from.
Key points:
- About 100 mistakes are enough to reveal the major error sources
- Analyzing 500 errors is acceptable and does no harm
- The condition of having enough data is met by the team's massive dataset
Rubric: The answer should recognize that about 100 mistakes is a sufficient baseline for identifying the major error sources, and clearly state that reviewing 500 is acceptable provided there is enough data.
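Worked example: the "enough data" condition can be checked with simple arithmetic. To expect a given number of mistakes in an Eyeball dev set, the set needs roughly (target mistakes) / (error rate) examples. The sketch below is illustrative only: the 5% error rate, the 1,000,000-example pool, and the function names are assumptions made for this example and are not given in the case.

```python
import math


def eyeball_dev_set_size(target_mistakes: int, error_rate: float) -> int:
    """Examples needed so the model is expected to make `target_mistakes` errors."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    # Round before taking the ceiling to absorb floating-point noise.
    return math.ceil(round(target_mistakes / error_rate, 6))


def enough_data(target_mistakes: int, error_rate: float, available: int) -> bool:
    """True if the available pool can supply an Eyeball dev set of the needed size."""
    return eyeball_dev_set_size(target_mistakes, error_rate) <= available


# Hypothetical numbers, assumed for illustration only.
ERROR_RATE = 0.05          # assumed 5% classifier error rate
AVAILABLE = 1_000_000      # assumed size of the team's data pool

for target in (100, 500):
    size = eyeball_dev_set_size(target, ERROR_RATE)
    ok = enough_data(target, ERROR_RATE, AVAILABLE)
    print(f"{target} mistakes -> about {size} examples needed; enough data: {ok}")
```

Under these assumed numbers, reviewing 500 mistakes needs an Eyeball dev set five times larger than reviewing 100, which is why the larger review only makes sense when data is plentiful.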
Tags
Machine Learning
Deep Learning
Machine Learning Strategy
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Yearning @ DeepLearning.AI