Match each train/dev/test splitting term to its correct definition from Machine Learning Yearning.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
For which dataset size is the 70/30 train/test split heuristic most appropriate?
True or False: As ML datasets grow to billions of examples, the fraction of data allocated to dev/test sets should grow proportionally to maintain the 70/30 split.
For large-scale ML problems, the _____ of data allocated to dev/test sets has been shrinking even as the absolute number of examples grows.
For which example-count range does the 70/30 train/test split heuristic work well, according to Andrew Ng?
As total dataset size grows into the billions, the fraction of data allocated to dev/test sets also increases.
The 70/30 train/test heuristic works well when you have a _____ number of examples, such as 100 to 10,000.
Match each dataset scale or concept to the correct dev/test sizing implication from Machine Learning Yearning.
Order the reasoning steps Ng uses to argue that the 70/30 heuristic does not apply to large datasets.
In big-data ML problems, what happens to the absolute number of dev/test examples as total dataset size grows?
Dev/test sets should be no larger than what is needed to evaluate the performance of your algorithms.
One popular heuristic was to use _____ of your data for your test set, though this does not apply to large datasets.
Match each train/dev/test splitting term to its correct definition from Machine Learning Yearning.
Order the steps a practitioner should follow when deciding test set size as their dataset grows from thousands to billions of examples.
Why does the traditional 70/30 train/test split heuristic fail to scale efficiently for big-data machine learning problems?
Evaluating dev and test set sizes for a billion-scale image classification system
Determining dev/test set size in the era of big-data machine learning