1Cademy - In the cat-app example, what percentage of dev/test data comes from internet images if all 210,000 images are randomly shuffled?

Learn Before

Avoid Randomly Shuffling Mixed-Source Data into Dev/Test Sets

Multiple Choice

In the cat-app example, what percentage of dev/test data comes from internet images if all 210,000 images are randomly shuffled?
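As a rough check on the arithmetic, the sketch below assumes the split quoted in the related questions on this page: 5,000 user images and 205,000 internet images, 210,000 in total. The variable names are illustrative only.

```python
# Assumed counts, taken from the related question on this page
# (5,000 user images shuffled with 205,000 internet images).
user_images = 5_000
internet_images = 205_000
total_images = user_images + internet_images  # 210,000

# Under a uniform random shuffle, every subset (train, dev, or test)
# has the same expected source mix as the whole pool.
internet_share = internet_images / total_images
internet_percent = round(internet_share * 100, 1)

print(f"{internet_percent}% of dev/test images are expected to be internet images")
```

Under these assumed counts the share comes out to roughly 97.6%, so only about 2.4% of the dev/test images would be expected to come from app users.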

Updated 2026-06-19

References

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI

What is the main problem with randomly shuffling user and internet images together into dev/test sets for a cat app?
True or False: Randomly shuffling all available data into train/dev/test sets is recommended when your data sources have different distributions.
Dev and test sets should be chosen to reflect the _____ you expect to encounter in the future, not the overall shuffled data pool.
True or False: Andrew Ng recommends randomly shuffling all available data into dev/test sets even when the data sources differ from the target distribution.
In the cat-app example, randomly shuffling all data means about _____% of dev/test images would come from internet sources.
Match each dev/test set characteristic to the consequence it produces for the ML team.
Order the reasoning steps that explain why randomly shuffling mixed-source data into dev/test sets is problematic.
According to Andrew Ng, what is the primary criterion when choosing dev and test sets?
True or False: Randomly shuffling 5,000 user images with 205,000 internet images produces a dev/test set that accurately reflects the app-user distribution.
Andrew Ng recommends choosing dev and test sets to reflect data you expect to get in the _____ and want to do well on.
Match each data-partitioning scenario to the corresponding recommendation or outcome from Machine Learning Yearning.
Order the steps for correctly partitioning mixed-source data so that dev/test sets reflect the target distribution.
Analyze the consequences of shuffling mixed-source data into the dev and test sets.
Evaluate the data partitioning strategy for a cat-app using mixed-source data.
Why must dev/test sets reflect the target distribution instead of a shuffled mix?
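One way to picture a partition in which the dev/test sets reflect the target distribution is sketched below. It is an illustration under stated assumptions, not the exact procedure from Machine Learning Yearning: user images are treated as the target distribution, the function name `partition_by_source` and the dev/test sizes are hypothetical, and the toy records stand in for real images.

```python
import random


def partition_by_source(user_items, internet_items, dev_size, test_size, seed=0):
    """Draw dev/test only from user data; everything else goes to training.

    Hypothetical helper for illustration: `user_items` is assumed to come
    from the target distribution, `internet_items` from a different source.
    """
    if dev_size + test_size > len(user_items):
        raise ValueError("not enough user items for the requested dev/test sizes")

    rng = random.Random(seed)
    shuffled_users = list(user_items)
    rng.shuffle(shuffled_users)  # shuffle within the target source only

    dev = shuffled_users[:dev_size]
    test = shuffled_users[dev_size:dev_size + test_size]
    leftover_users = shuffled_users[dev_size + test_size:]

    # Internet data is used for training only, together with leftover user data.
    train = list(internet_items) + leftover_users
    rng.shuffle(train)
    return train, dev, test


# Toy stand-ins for image records: (source, index) tuples.
user = [("user", i) for i in range(50)]
internet = [("internet", i) for i in range(2050)]
train, dev, test = partition_by_source(user, internet, dev_size=10, test_size=10)

print(len(train), len(dev), len(test))
```

The design choice here is that shuffling happens only inside the user pool before the dev/test draw, so every dev/test example comes from the source the app is expected to see, while the internet images still contribute to training.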
