Example Data Mismatch Error Pattern
A data mismatch problem can appear when humans achieve near perfect performance, the algorithm has 1% training error, 1.5% error on unseen data from the training-set distribution, and 10% dev-set error.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
Example Data Mismatch Error Pattern
Finding Training Data That Better Matches Difficult Dev Examples
Addressing Data Mismatch by Comparing Data Properties
What defines data mismatch in the context of training and dev/test sets?
Data mismatch means the algorithm fails to generalize even to new data drawn from the training distribution.
Data mismatch is named because the training set data is a poor _____ for the dev/test set data.
Match each data mismatch term to its correct description.
Order the steps for diagnosing a data mismatch problem from training through evaluation.
What is the root cause of a data mismatch problem between training and dev/test sets?
A speech recognition system that does well on training and training dev sets but poorly on the dev set has a data mismatch problem.
An algorithm with data mismatch generalizes well to the _____ distribution but not to the dev/test distribution.
Match each performance pattern to its diagnostic interpretation regarding data mismatch.
Order the reasoning chain that leads from observation to a data mismatch conclusion.
Explain how generalization performance differences between training and dev/test distributions indicate data mismatch.
Diagnose generalization issues in a speech recognition system.
Explain the naming origin of the data mismatch problem.
Learn After
Which error comparison most directly reveals the data mismatch problem in the example with 1% train, 1.5% same-dist, and 10% dev error?
The error pattern (1% train, 1.5% same-dist unseen, 10% dev) primarily indicates a high variance (overfitting) problem.
In the data mismatch example, the algorithm achieves _____ error on the dev set.
Match each error measurement in the data mismatch example to its correct value.
Order the analytical steps used to diagnose the data mismatch problem in the Machine Learning Yearning example.
What does the 0.5% gap between training error (1%) and same-distribution unseen error (1.5%) indicate in the data mismatch example?
In the data mismatch example, the 1% training error vs. near-perfect human performance represents the most significant problem to address.
In the data mismatch example, error on unseen data drawn from the same distribution as the training set is _____.
Match each error comparison in the data mismatch example to the type of problem it diagnoses.
Order the observations that build the case that data mismatch — not overfitting or underfitting — is the dominant problem in the Machine Learning Yearning example.