Essay

Addressing Distribution Mismatches in Cat Apps

Question: Explain why a cat app evaluated primarily on adult cats might have an unrepresentative dev/test set if users upload many kitten pictures.

Sample answer: If the initial dev/test set consists mainly of pictures of adult cats, the evaluation measures how well the system recognizes adult cats. If users then upload far more kitten images than expected once the app has shipped, the model encounters a data distribution different from the one it was evaluated on. The initial dev/test set is therefore not representative of the distribution the system actually needs to perform well on in the real world, and its reported accuracy may not reflect what users experience.
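The effect of the mismatch can be illustrated with a small calculation. The sketch below is only an illustration: the per-group accuracies and the adult/kitten proportions are made-up numbers, and `expected_accuracy` is a hypothetical helper, not part of any library. It treats overall accuracy as the average of per-group accuracies weighted by each group's share of the data.

```python
def expected_accuracy(per_group_accuracy, group_mix):
    """Overall accuracy as a mix-weighted average of per-group accuracies."""
    if abs(sum(group_mix.values()) - 1.0) > 1e-9:
        raise ValueError("group proportions must sum to 1")
    return sum(share * per_group_accuracy[group]
               for group, share in group_mix.items())


# Assumed (hypothetical) accuracy of the classifier on each kind of picture.
per_group_accuracy = {"adult": 0.95, "kitten": 0.70}

# Assumed composition of the dev/test set versus real user uploads.
dev_mix = {"adult": 0.95, "kitten": 0.05}
upload_mix = {"adult": 0.50, "kitten": 0.50}

dev_acc = expected_accuracy(per_group_accuracy, dev_mix)
upload_acc = expected_accuracy(per_group_accuracy, upload_mix)

print(f"Accuracy measured on dev/test set: {dev_acc:.2%}")
print(f"Accuracy on actual user uploads:   {upload_acc:.2%}")
```

Under these assumed numbers the dev/test set reports roughly 94% accuracy while users would see roughly 82%, because the dev/test set barely weights the kitten pictures that dominate real usage.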

Key points:

  • Dev/test set mainly contained pictures of adult cats
  • Users uploaded an unexpectedly high number of kitten pictures
  • Dev/test set distribution is not representative
  • Mismatch with the actual distribution the system must handle

Rubric: The response must explain that evaluating on adult cats does not reflect real-world usage if users upload kittens, explicitly concluding that the dev/test set is not representative of the actual distribution.

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Machine Learning Strategy

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Yearning @ DeepLearning.AI