Case Study

The Unexpected Kitten Influx

Case context: You lead a machine learning team that just launched a cat-spotter app. Your initial dev/test set mainly contained pictures of adult cats. After the launch weekend, you review the logs and find that users are uploading a lot more kitten images than expected.

Question: Diagnose the primary issue with your evaluation metrics based on this user behavior.

Sample answer: The primary issue is that the dev/test set, which consists mostly of adult cats, no longer represents the images users actually upload, which include far more kittens than expected. As a result, metrics measured on the initial dev/test set will not accurately reflect how well the system performs on the distribution it needs to do well on.
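One way to make this diagnosis concrete is to compare the label mix of the dev/test set against a labeled sample of logged uploads. The sketch below is only illustrative: the label names `adult_cat` and `kitten`, the counts, and the helper functions `label_shares` and `total_variation` are all hypothetical, and it assumes someone has labeled a sample of the uploads by hand.

```python
from collections import Counter


def label_shares(labels):
    """Return the fraction of examples carrying each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two label distributions.

    0 means the distributions are identical, 1 means they do not overlap.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical counts, made up for illustration only.
dev_labels = ["adult_cat"] * 90 + ["kitten"] * 10      # initial dev/test set
upload_labels = ["adult_cat"] * 40 + ["kitten"] * 60   # labeled sample of uploads

dev_shares = label_shares(dev_labels)
upload_shares = label_shares(upload_labels)
gap = total_variation(dev_shares, upload_shares)

print("dev/test shares:", dev_shares)
print("upload shares:  ", upload_shares)
print("total variation distance:", gap)
```

A large gap between the two mixes suggests that accuracy on the dev/test set is being measured mostly on adult cats, so it says little about performance on the kitten images users actually send.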

Key points:

  • Initial dev/test set contains mainly adult cats
  • Actual user uploads feature many kittens
  • Dev/test set is not representative of the actual distribution

Rubric: Correctly diagnoses that the dev/test set is no longer representative of the actual distribution of user uploads.

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Machine Learning Strategy

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Yearning @ DeepLearning.AI