Essay

Analyzing High Variance Using Training and Training-Dev Error Rates

Question: In a cat-detection system where the optimal error rate is near 0%, an algorithm yields a 1% training error, 5% training-dev error, and 5% dev error. Explain why this specific pattern of errors leads to the conclusion that the algorithm suffers from high variance rather than high bias.

Sample answer: The algorithm suffers from high variance because there is a large gap between the training error (1%) and the training-dev error (5%). The training error is within one percentage point of the optimal error rate (~0%), so avoidable bias is low and the model fits the training data well. However, the error rises by four percentage points on the training-dev set, which is drawn from the same distribution as the training set but was not used for training. This failure to generalize to unseen data from the same distribution is the hallmark of high variance.

Key points:

  • Training error is 1% while optimal error is near 0%, so avoidable bias is about 1 percentage point, indicating low bias.
  • Training-dev error is 5%, a gap of 4 percentage points above the 1% training error.
  • A large gap between training error and training-dev error indicates a failure to generalize to unseen data from the same distribution, which is the signature of high variance.
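
The arithmetic behind these key points can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `diagnose` and the 2-percentage-point `threshold` for calling a gap "large" are assumptions made for this example, not values given in the question. All error rates are expressed in percentage points.

```python
def diagnose(optimal, train, train_dev, dev, threshold=2.0):
    """Split error rates (in percentage points) into the gaps used above.

    `threshold` is a hypothetical cutoff for calling a gap "large";
    the question itself does not specify one.
    """
    avoidable_bias = train - optimal   # how far training error is from optimal
    variance = train_dev - train       # same distribution, unseen data
    data_mismatch = dev - train_dev    # shift between the two distributions

    findings = []
    if avoidable_bias >= threshold:
        findings.append("high bias")
    if variance >= threshold:
        findings.append("high variance")
    if data_mismatch >= threshold:
        findings.append("data mismatch")

    return {
        "avoidable_bias": avoidable_bias,
        "variance": variance,
        "data_mismatch": data_mismatch,
        "findings": findings,
    }


# The cat-detection numbers from the question: 0% optimal, 1% training,
# 5% training-dev, 5% dev.
report = diagnose(optimal=0.0, train=1.0, train_dev=5.0, dev=5.0)
print(report["findings"])  # ['high variance']
```

With the numbers from the question, the only gap that crosses the assumed threshold is the 4-point gap between training and training-dev error, which matches the high-variance conclusion.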

Rubric: A strong response should correctly identify that the small gap between training error and optimal error rules out high bias, while the large gap between training error and training-dev error indicates high variance.

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI