Addressing Distribution Mismatches in Cat Apps
Question: Explain why a cat app evaluated primarily on adult cats might have an unrepresentative dev/test set if users upload many kitten pictures.
Sample answer: If the initial dev/test set consists mainly of pictures of adult cats, it measures how well the system recognizes adult cats. If users then upload far more kitten images than expected once the app has shipped, the model faces a different data distribution from the one it was evaluated on. The initial dev/test set is therefore not representative of the actual distribution the system needs to perform well on in the real world.
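To make the mismatch concrete, here is a minimal sketch that treats overall accuracy as a weighted average of per-group accuracies. All numbers are hypothetical and chosen only for illustration: the per-group accuracies, the adult/kitten proportions, and the helper name `expected_accuracy` are assumptions, not figures from any real cat app.

```python
def expected_accuracy(mix, per_group_accuracy):
    """Overall accuracy as the mix-weighted average of per-group accuracies."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mix proportions must sum to 1")
    return sum(share * per_group_accuracy[group] for group, share in mix.items())


# Hypothetical per-group accuracies (illustrative only).
per_group_accuracy = {"adult": 0.95, "kitten": 0.70}

# Hypothetical mixes: the dev/test set is mostly adult cats,
# while real user uploads turn out to be mostly kittens.
dev_test_mix = {"adult": 0.9, "kitten": 0.1}
user_upload_mix = {"adult": 0.4, "kitten": 0.6}

dev_test_estimate = expected_accuracy(dev_test_mix, per_group_accuracy)
real_world_accuracy = expected_accuracy(user_upload_mix, per_group_accuracy)

print(f"dev/test estimate:   {dev_test_estimate:.3f}")
print(f"real-world accuracy: {real_world_accuracy:.3f}")
```

Under these assumed numbers the dev/test set reports a higher accuracy than users would actually experience, purely because its adult/kitten mix differs from the upload mix. The model itself is unchanged; only the distribution it is measured on differs.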
Key points:
- Dev/test set mainly contained adult cats
- Users uploaded unexpectedly high numbers of kittens
- Dev/test set distribution is not representative
- Mismatch with actual distribution
Rubric: The response must explain that evaluating on adult cats does not reflect real-world usage if users upload kittens, explicitly concluding that the dev/test set is not representative of the actual distribution.
Tags
Machine Learning
Deep Learning
Machine Learning Strategy
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Yearning @ DeepLearning.AI
Related
Identifying Distribution Mismatch in Cat Apps
Dev/Test Set vs User Uploads
Unrepresentative Dev/Test Set After _____ Uploads
Real-World Distribution Mismatch Elements
Discovering a Distribution Mismatch
The Unexpected Kitten Influx
Implications of Kitten Uploads
Evaluating Actual vs Dev/Test Distributions
Impact of Unexpected User Uploads