1Cademy - The Unexpected Kitten Influx

Learn Before

Kitten Uploads Revealing Dev/Test Distribution Mismatch

Case Study

Case context: You lead a machine learning team that has just launched a cat-spotter app. Your initial dev/test set consisted mostly of pictures of adult cats. After the launch weekend, you review the logs and find that users are uploading far more kitten images than expected.

Question: Diagnose the primary issue with your evaluation metrics based on this user behavior.

Sample answer: The primary issue is a distribution mismatch. The current dev/test set is mostly adult cats, so it is not representative of the images users actually upload, which include far more kittens than expected. As a result, performance measured on the initial dev/test set will not accurately reflect how well the system performs on the distribution it actually needs to do well on.
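
To make the diagnosis concrete, here is a minimal sketch of how the same model can score very differently once the mix of adult cats and kittens changes. Every number in it is hypothetical and chosen only for illustration (the case study gives no accuracies or proportions), and the helper `expected_accuracy` is an assumed name, not part of any library.

```python
def expected_accuracy(per_group_accuracy, group_mix):
    """Overall accuracy expected on data whose group proportions are `group_mix`."""
    if abs(sum(group_mix.values()) - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    # Weight each group's accuracy by how often that group appears.
    return sum(per_group_accuracy[group] * share
               for group, share in group_mix.items())


# Hypothetical per-group accuracies: the model is weaker on kittens.
per_group_accuracy = {"adult_cat": 0.95, "kitten": 0.70}

# Hypothetical mixes: the dev/test set is mostly adult cats,
# while user uploads are mostly kittens.
dev_mix = {"adult_cat": 0.90, "kitten": 0.10}
user_mix = {"adult_cat": 0.40, "kitten": 0.60}

dev_estimate = expected_accuracy(per_group_accuracy, dev_mix)
live_estimate = expected_accuracy(per_group_accuracy, user_mix)

print(f"Accuracy suggested by the dev/test set: {dev_estimate:.3f}")
print(f"Accuracy expected on user uploads:      {live_estimate:.3f}")
```

Under these assumed numbers the dev/test set reports a noticeably higher accuracy than users would experience. The model itself has not changed; only the data mix has, which is why the dev/test set, rather than the model, is the first thing to revisit.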

Key points: