In the cat-app example, what percentage of dev/test data comes from internet images if all 210,000 images are randomly shuffled?
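The arithmetic behind this question can be sketched in a few lines. This is a minimal sketch, assuming the split given in the related cards (5,000 user images and 205,000 internet images) and a uniform random shuffle, so that each split inherits the pool's proportions in expectation. The variable names are illustrative and do not come from the source.

```python
# Assumed counts, taken from the related cards: 5,000 user images
# and 205,000 internet images, 210,000 in total.
user_images = 5_000
internet_images = 205_000
total_images = user_images + internet_images

# Under a uniform random shuffle, the expected share of internet images
# in any split (train, dev, or test) equals their share of the whole pool.
internet_share = internet_images / total_images

print(f"{internet_share:.1%}")  # prints 97.6%
```

With roughly 97.6% of the dev/test images coming from the internet, only about 2.4% would be the user images that the app is actually meant to handle.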
References
Machine Learning Yearning (Deeplearning.ai)
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
What is the main problem with randomly shuffling user and internet images together into dev/test sets for a cat app?
True or False: Randomly shuffling all available data into train/dev/test sets is recommended when your data sources have different distributions.
Dev and test sets should be chosen to reflect the _____ you expect to encounter in the future, not the overall shuffled data pool.
In the cat-app example, what percentage of dev/test data comes from internet images if all 210,000 images are randomly shuffled?
Andrew Ng recommends randomly shuffling all available data into dev/test sets even when the data sources differ from the target distribution.
In the cat-app example, randomly shuffling all data means about _____% of dev/test images would come from internet sources.
Match each dev/test set characteristic to the consequence it produces for the ML team.
Order the reasoning steps that explain why randomly shuffling mixed-source data into dev/test sets is problematic.
According to Andrew Ng, what is the primary criterion when choosing dev and test sets?
Randomly shuffling 5,000 user images with 205,000 internet images produces a dev/test set that accurately reflects the app-user distribution.
Andrew Ng recommends choosing dev and test sets to reflect data you expect to get in the _____ and want to do well on.
Match each data-partitioning scenario to the corresponding recommendation or outcome from Machine Learning Yearning.
Order the steps for correctly partitioning mixed-source data so that dev/test sets reflect the target distribution.
Analyze the consequences of shuffling mixed-source data into the dev and test sets.
Evaluate the data partitioning strategy for a cat-app using mixed-source data.
Why must dev/test sets reflect the target distribution instead of a shuffled mix?