Subsets of Error Sources in Machine Learning Algorithms
Question: In machine learning development, must avoidable bias, variance, and data mismatch always occur together, or can an algorithm suffer from only a specific subset of these three problems?
Sample answer: No, the three problems do not have to occur together. An algorithm can suffer from any subset of high avoidable bias, high variance, and data mismatch: only one of them, any two (e.g., just variance and data mismatch), or all three. Which subset appears depends on the model and the data.
Key points:
- Avoidable bias, variance, and data mismatch do not have to occur together.
- An algorithm can suffer from any subset of high avoidable bias, high variance, and data mismatch.
Rubric: The answer should state that the three problems do not have to occur together and that a model can suffer from any subset or combination of avoidable bias, variance, and data mismatch.
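The sample answer can be illustrated with a minimal sketch that reads each problem off its own error gap, using the comparisons named in the related questions: training error against human-level error (avoidable bias), training-dev error against training error (variance), and dev error against training-dev error (data mismatch). The function name `diagnose`, the 2-point `threshold`, and all error values are assumptions chosen for illustration, not figures from the source.

```python
def diagnose(human_level, train, train_dev, dev, threshold=0.02):
    """Return the set of problems whose error gap exceeds `threshold`.

    All errors are fractions in [0, 1]. The threshold is a hypothetical
    cutoff for what counts as a "high" gap; pick one that suits your task.
    """
    gaps = {
        # Gap between training error and human-level error.
        "avoidable bias": train - human_level,
        # Gap between training-dev error and training error.
        "variance": train_dev - train,
        # Gap between dev error and training-dev error.
        "data mismatch": dev - train_dev,
    }
    return {name for name, gap in gaps.items() if gap > threshold}


# Illustrative (made-up) error profiles, each producing a different subset.
only_mismatch = diagnose(human_level=0.01, train=0.01, train_dev=0.015, dev=0.10)
bias_and_mismatch = diagnose(human_level=0.01, train=0.10, train_dev=0.11, dev=0.20)
variance_and_mismatch = diagnose(human_level=0.01, train=0.02, train_dev=0.10, dev=0.20)
all_three = diagnose(human_level=0.01, train=0.10, train_dev=0.20, dev=0.30)
none_present = diagnose(human_level=0.01, train=0.015, train_dev=0.02, dev=0.025)

print(only_mismatch)
print(bias_and_mismatch)
print(variance_and_mismatch)
print(all_three)
print(none_present)
```

Because each gap is checked separately, the function can return any of the eight possible subsets, including the empty set, which mirrors the point that the three problems need not appear together.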
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
High Avoidable Bias and Data Mismatch Without High Variance
Which statement best describes how avoidable bias, variance, and data mismatch can affect a single learning algorithm?
True or False: A learning algorithm can exhibit high avoidable bias and data mismatch at the same time without necessarily having high variance.
According to Machine Learning Yearning, it is possible for an algorithm to suffer from any _____ of high avoidable bias, high variance, and data mismatch.
Which statement best describes how high avoidable bias, high variance, and data mismatch can co-exist in a single algorithm?
An algorithm can exhibit high variance and data mismatch simultaneously, without suffering from high avoidable bias.
It is possible for an algorithm to suffer from any _____ of high avoidable bias, high variance, and data mismatch.
Match each of the three error sources to the comparison that most directly reveals it.
Order the diagnostic steps for identifying which subset of the three error sources affects an algorithm.
Training error equals human-level error, training-dev error closely matches training error, but dev error is far higher. Which subset of problems is present?
An algorithm must always exhibit all three problems—high avoidable bias, high variance, and data mismatch—together; they cannot occur in isolation.
When training error ≈ human-level and training-dev ≈ training error, but dev error is much higher, the algorithm suffers from data _____ as its primary problem.
Match each two-problem combination to the diagnostic error-gap pattern it produces.
Order the reasoning steps for planning improvements when an algorithm is diagnosed with all three problems simultaneously.
Explaining the Co-existence of Avoidable Bias, Variance, and Data Mismatch
Diagnosing Co-existing Errors in a Speech Recognition System