Explain what is measured by evaluating on the training dev set.
Question: In the four-dataset evaluation framework, what does evaluating the algorithm on the training dev set specifically measure?
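Sample answer: Evaluating the algorithm on the training dev set measures its ability to generalize to new data drawn from the training-set distribution. The training dev set comes from the same distribution as the training set but is not used for training, so its error shows how well the algorithm handles unseen examples of the kind it was trained on.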
Key points:
- It evaluates the ability to generalize to new data.
- The new data must be drawn from the training-set distribution.
Rubric: The answer must state that the training dev set evaluates the algorithm's ability to generalize to new data drawn from the training-set distribution.
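Illustration: the idea above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a prescribed procedure: the helper names (split_training_dev, error_rate, generalization_gap), the 10% hold-out fraction, and the representation of examples as (input, label) pairs are all hypothetical choices made for this example.

```python
import random


def split_training_dev(examples, dev_fraction=0.1, seed=0):
    """Hold out a random slice of training-distribution data.

    Returns (training_set, training_dev_set). Both come from the same
    distribution; only the first is meant to be used for training.
    The 10% default is an arbitrary choice for illustration.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, int(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]


def error_rate(predict, examples):
    """Fraction of (x, y) pairs for which predict(x) differs from y."""
    if not examples:
        raise ValueError("examples must not be empty")
    wrong = sum(1 for x, y in examples if predict(x) != y)
    return wrong / len(examples)


def generalization_gap(predict, training_set, training_dev_set):
    """Training dev error minus training error.

    A large positive gap suggests the model does not generalize well
    to new data from the training-set distribution.
    """
    return error_rate(predict, training_dev_set) - error_rate(predict, training_set)


if __name__ == "__main__":
    # Toy data: label is the parity of the input.
    data = [(i, i % 2) for i in range(100)]
    train, train_dev = split_training_dev(data, dev_fraction=0.2, seed=1)

    # A "model" that only memorizes the training set and guesses 0 otherwise.
    memorized = dict(train)

    def predict(x):
        return memorized.get(x, 0)

    print("training error:", error_rate(predict, train))
    print("training dev error:", error_rate(predict, train_dev))
    print("gap:", generalization_gap(predict, train, train_dev))
```

The memorizing model in the usage example has zero training error by construction, so any error it shows on the training dev set reflects a failure to generalize within the training-set distribution, which is exactly what this set is meant to expose.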
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
Which dataset is used to evaluate an algorithm's ability to generalize to new data drawn from the training set distribution?
Training error is evaluated by running the algorithm on the training set.
The algorithm's performance on the task you care about is evaluated using the _____ and/or test sets.
Match each dataset in the four-dataset framework to its primary evaluation purpose.
Order the four dataset evaluations from measuring training error through to final target-task performance as described in ML Yearning.
According to ML Yearning, which dataset combination evaluates the algorithm's performance on the task you care about?
The training dev set is drawn from the same distribution as the dev and test sets.
In the four-dataset framework, the _____ dev set evaluates generalization to new data drawn from the training distribution.
Match each evaluation goal to the dataset that measures it in ML Yearning's four-dataset framework.
Order the diagnostic reasoning steps a practitioner follows when using the four-dataset framework to identify an algorithm's key weakness.
Describe how training error, generalization, and target-task performance are evaluated in a four-dataset framework.
Diagnose classifier performance using the four-dataset evaluation framework.