1Cademy - Why must dev/test sets reflect the target distribution instead of a shuffled mix?

Learn Before

Avoid Randomly Shuffling Mixed-Source Data into Dev/Test Sets

Short Answer

Question: According to the cat-app example, why does randomly shuffling 205,000 internet images and 5,000 user images into dev/test sets fail to reflect the target distribution?

Sample answer: After a random shuffle, about 97.6% of the images in the dev/test sets (205,000 out of 210,000) are internet images. The dev/test sets therefore do not reflect the app-user distribution that we expect to see in the future and want to do well on.

Key points:

Shuffling results in dev/test sets being about 97.6% internet images.
The resulting dev/test sets fail to reflect the target app-user distribution.

Rubric: The response must mention that shuffling results in about 97.6% internet images in the dev/test sets, which fails to reflect the target app-user distribution.
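
Worked check: the 97.6% figure in the answer above can be reproduced with a short calculation. The sketch below is illustrative only. The image counts come from the question, while the function names and the evaluation-set size of 5,000 are assumptions made for this example and are not part of the original cat-app example.

```python
import random

# Counts taken from the question text.
INTERNET_IMAGES = 205_000
USER_IMAGES = 5_000


def expected_internet_fraction(n_internet: int, n_user: int) -> float:
    """Expected share of internet images in any uniformly random subset."""
    return n_internet / (n_internet + n_user)


def simulated_internet_fraction(
    n_internet: int, n_user: int, eval_size: int, seed: int = 0
) -> float:
    """Shuffle the mixed pool and measure the internet share of the first
    `eval_size` items, which stand in for the combined dev/test sets.

    `eval_size` is a hypothetical value chosen for illustration.
    """
    rng = random.Random(seed)
    pool = ["internet"] * n_internet + ["user"] * n_user
    rng.shuffle(pool)
    eval_set = pool[:eval_size]
    return eval_set.count("internet") / eval_size


if __name__ == "__main__":
    exact = expected_internet_fraction(INTERNET_IMAGES, USER_IMAGES)
    print(f"Expected internet share: {exact:.1%}")

    simulated = simulated_internet_fraction(INTERNET_IMAGES, USER_IMAGES, 5_000)
    print(f"Simulated internet share: {simulated:.1%}")
```

The simulated share varies slightly with the seed but stays close to the expected value, which is why a shuffled dev/test set mostly measures performance on internet images rather than on app-user images.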

Updated 2026-07-05
