Learn Before
State the significance of the unseen training-distribution error.
Question: In the data mismatch example, the algorithm achieves a 1.5% error rate on unseen data from the training-set distribution. What does this specific metric tell you about the algorithm's performance?
Sample answer: It tells you that the algorithm generalizes well to the training-set distribution, meaning it does not suffer from a severe high variance (overfitting) problem on that specific distribution.
Key points:
- The algorithm successfully generalizes to the training data distribution.
- High variance (overfitting) on the training distribution is ruled out as the primary issue.
Rubric: The answer should mention that the algorithm generalizes well to the training distribution or that variance (overfitting) is not the main issue on that distribution.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
Which error comparison most directly reveals the data mismatch problem in the example with 1% train, 1.5% same-dist, and 10% dev error?
The error pattern (1% train, 1.5% same-dist unseen, 10% dev) primarily indicates a high variance (overfitting) problem.
In the data mismatch example, the algorithm achieves _____ error on the dev set.
Match each error measurement in the data mismatch example to its correct value.
Order the analytical steps used to diagnose the data mismatch problem in the Machine Learning Yearning example.
What does the 0.5% gap between training error (1%) and same-distribution unseen error (1.5%) indicate in the data mismatch example?
In the data mismatch example, the 1% training error vs. near-perfect human performance represents the most significant problem to address.
In the data mismatch example, error on unseen data drawn from the same distribution as the training set is _____.
Match each error comparison in the data mismatch example to the type of problem it diagnoses.
Order the observations that build the case that data mismatch — not overfitting or underfitting — is the dominant problem in the Machine Learning Yearning example.
Explain how specific error comparisons point to data mismatch.
Diagnose a 10% dev-set error in a cat recognition app.
State the significance of the unseen training-distribution error.