Learn Before
Choosing Dev and Test Sets from the Same Distribution When Possible
Once dev and test sets are defined, a team will focus on improving dev set performance, so the dev set should reflect the task one wants to improve on the most. Having different dev and test set distributions can lead to a system that works well on the dev set but does poorly on the test set. If the dev and test sets came from the same distribution, that result would have a clear diagnosis: the system has overfit the dev set, and the obvious cure is to get more dev set data. If the dev and test sets come from different distributions, the options are less clear: the system may have overfit the dev set, the test set may be harder than the dev set, or the algorithm might be doing as well as could be expected.
0
1
References
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
Purpose of Dev and Test Sets
Choosing Dev and Test Sets to Reflect Future Data
Choosing Dev and Test Sets from the Same Distribution When Possible
Changing Dev/Test Sets or Evaluation Metrics During a Project Is Common
Quickly Establishing Dev/Test Sets and Metric for New Applications
Sizing the Dev Set to Detect Meaningful Accuracy Changes
What is the primary purpose of a development (dev) set in a machine learning project?
The development set is sometimes called the 'hold-out cross validation set'.
The development set is sometimes also called the _____ cross validation set.
Match each development set role to its description in Machine Learning Yearning.
Order the steps for using a dev set to choose between two model configurations.
Which of the following is NOT a stated use of the development set in Machine Learning Yearning?
The dev set and the training set refer to the same data partition in Machine Learning Yearning.
The dev set is used to tune parameters, _____ features, and make other decisions about the learning algorithm.
Match each dev set use from Machine Learning Yearning to the activity that exemplifies it.
Arrange the stages of a machine learning project that involve the dev set in the correct workflow order.
Explain the role and alternative terminology for the development set.
Determine the correct data partition for tuning a model's parameters and features.
List the three primary uses of the development set.
Learn After
Dev Set Should Reflect the Task to Improve Most
Same-Distribution Dev/Test Failure Indicates Dev Set Overfitting
Different Dev/Test Distributions Make Failure Diagnosis Ambiguous
Third-Party Benchmark Distribution Mismatch Increases Luck
When dev and test sets share the same distribution and test performance is worse than dev performance, what does this clearly indicate?
True or False: When dev and test sets come from different distributions, a performance gap between them has a single, unambiguous diagnosis.
When a system has overfit the dev set and both sets share the same distribution, the obvious cure is to get more _____ data.
Why should the dev set reflect the task a team wants to improve on the most?
If both sets share the same distribution and a model performs well on dev but poorly on test, the clear diagnosis is dev set overfitting.
When a model overfits the dev set and both sets share the same distribution, the obvious cure is to get more _____ data.
Match each dev/test set scenario to its consequence for model diagnosis.
Order the diagnostic steps when a model works well on the dev set but fails on the test set.
Which is a possible explanation for poor test performance when dev and test sets come from different distributions?
When dev and test sets come from different distributions, a system's failure on the test set provides an unambiguous diagnosis.
Once the dev and test sets are defined, a team will be focused on improving _____ set performance.
Match each concept related to dev/test distribution to its correct description.
Order the steps for selecting dev and test sets that support clear model evaluation.
Compare the diagnostics of poor test performance under same vs. different dev/test distributions.
Diagnosing a drop in test set performance with mismatched distributions.
Identify the diagnosis and cure for poor test performance when distributions match.