1Cademy - Match each train/dev/test splitting term to its correct definition from Machine Learning Yearning.

Learn Before

70/30 Train/Test Split Heuristic Does Not Apply to Large Datasets

Matching

Match each train/dev/test splitting term to its correct definition from Machine Learning Yearning.

Updated 2026-06-29

Contributors are:

Who are from:

References

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI

For which dataset size is the 70/30 train/test split heuristic most appropriate?
True or False: As ML datasets grow to billions of examples, the fraction of data allocated to dev/test sets should grow proportionally to maintain the 70/30 split.
For large-scale ML problems, the _____ of data allocated to dev/test sets has been shrinking even as the absolute number of examples grows.
For which example-count range does the 70/30 train/test split heuristic work well, according to Andrew Ng?
As total dataset size grows into the billions, the fraction of data allocated to dev/test sets also increases.
The 70/30 train/test heuristic works well when you have a _____ number of examples, such as 100 to 10,000.
Match each dataset scale or concept to the correct dev/test sizing implication from Machine Learning Yearning.
Order the reasoning steps Ng uses to argue that the 70/30 heuristic does not apply to large datasets.
In big-data ML problems, what happens to the absolute number of dev/test examples as total dataset size grows?
Dev/test sets should be no larger than what is needed to evaluate the performance of your algorithms.
One popular heuristic was to use _____ of your data for your test set, though this does not apply to large datasets.
Match each train/dev/test splitting term to its correct definition from Machine Learning Yearning.
Order the steps a practitioner should follow when deciding test set size as their dataset grows from thousands to billions of examples.
Why does the traditional 70/30 train/test split heuristic fail to scale efficiently for big-data machine learning problems?
Evaluating dev and test set sizes for a billion-scale image classification system
Determining dev/test set size in the era of big-data machine learning

Learn Before

Related