The Unexpected Kitten Influx
Case context: You lead a machine learning team that just launched a cat-spotter app. Your initial dev/test set mainly contained pictures of adult cats. After the launch weekend, you review the logs and find that users are uploading a lot more kitten images than expected.
Question: Diagnose the primary issue with your evaluation metrics based on this user behavior.
Sample answer: The primary issue is that the dev/test set, which consists mostly of adult cats, no longer represents the distribution of images users actually upload, which includes far more kittens than expected. As a result, metrics measured on that dev/test set do not accurately reflect how well the system performs on the distribution it needs to do well on.
Key points:
- Initial set mainly adult cats
- Actual user uploads feature many kittens
- Dev/test set is not representative of the actual distribution
Rubric: Correctly diagnoses that the dev/test set is no longer representative of the actual distribution of user uploads.
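Illustration: The mismatch described in the sample answer can be made concrete with a small sketch. It assumes the classifier's accuracy is known separately for adult cats and kittens, and reweights those per-group accuracies by each distribution's mix. All numbers and names below (the accuracies, the shares, `weighted_accuracy`) are hypothetical, chosen only for illustration; they are not from the case.

```python
def weighted_accuracy(per_group_accuracy, group_share):
    """Expected accuracy when images arrive in the proportions given by group_share."""
    total = sum(group_share.values())
    return sum(
        per_group_accuracy[group] * share / total
        for group, share in group_share.items()
    )


# Hypothetical per-group accuracies (assumption: the model is weaker on kittens).
per_group_accuracy = {"adult_cat": 0.95, "kitten": 0.70}

# Hypothetical mixes: the dev/test set is mostly adult cats,
# while user uploads contain many more kittens.
dev_set_share = {"adult_cat": 0.90, "kitten": 0.10}
user_upload_share = {"adult_cat": 0.40, "kitten": 0.60}

dev_estimate = weighted_accuracy(per_group_accuracy, dev_set_share)
live_estimate = weighted_accuracy(per_group_accuracy, user_upload_share)

print(f"dev/test estimate:    {dev_estimate:.3f}")
print(f"user-upload estimate: {live_estimate:.3f}")
```

Under these assumed numbers the dev/test metric overstates real-world performance, which is the symptom the rubric asks you to diagnose: the metric is computed correctly, but on the wrong distribution.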
Tags
Machine Learning
Deep Learning
Machine Learning Strategy
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Yearning @ DeepLearning.AI
Related
Identifying Distribution Mismatch in Cat Apps
Dev/Test Set vs User Uploads
Unrepresentative Dev/Test Set After _____ Uploads
Real-World Distribution Mismatch Elements
Discovering a Distribution Mismatch
Addressing Distribution Mismatches in Cat Apps
Implications of Kitten Uploads
Evaluating Actual vs Dev/Test Distributions
Impact of Unexpected User Uploads