Case Study

Diagnosing Overfitting in a Cat Image Recognizer

Case context: You are developing a cat classification model. After training, your algorithm achieves a 1% error rate on your training set but an 11% error rate on your development set. An ideal human classifier achieves nearly 0% error.

Question: Based on the error metrics provided, diagnose the primary problem with this classifier. What specific ML terminology describes this situation, and what do the 1% training error and the 10% gap between the dev set error and the training error represent?

Sample answer: The primary problem is that the classifier is overfitting to the training data. Because an ideal classifier achieves nearly 0% error, the 1% training error represents the estimated bias, which is very low. The 10% difference between the dev set error (11%) and the training error (1%) represents the estimated variance. Because the variance is high, the model is failing to generalize to new data.
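The arithmetic behind this diagnosis can be written as a short Python sketch. The function name `diagnose`, its parameters, and the rule of naming whichever estimate is larger as the primary issue are illustrative assumptions, not part of the case study. Errors are given in percentage points.

```python
def diagnose(train_error, dev_error, optimal_error=0.0):
    """Split dev set error into rough bias and variance estimates.

    All arguments are error rates in percentage points. The names and the
    comparison rule below are hypothetical, chosen for illustration only.
    """
    # Estimated bias: how far training error sits above the optimal error.
    bias = train_error - optimal_error
    # Estimated variance: how much worse the model does on the dev set.
    variance = dev_error - train_error

    # Assumed rule of thumb: call the larger estimate the primary issue.
    if variance > bias:
        primary = "high variance (overfitting)"
    elif bias > variance:
        primary = "high bias (underfitting)"
    else:
        primary = "bias and variance are comparable"
    return bias, variance, primary


# Values from the case: 1% training error, 11% dev error, ~0% optimal error.
bias, variance, primary = diagnose(train_error=1.0, dev_error=11.0)
print(f"bias = {bias}%, variance = {variance}%, primary issue: {primary}")
```

With the case values, the variance estimate (10 points) dominates the bias estimate (1 point), which matches the overfitting diagnosis above.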

Key points:

  • Diagnose the problem as overfitting or failing to generalize.
  • Identify the 1% metric as the estimated bias.
  • Calculate or identify the 10% gap as the estimated variance.
  • Recognize that high variance is the primary issue.

Rubric: A correct diagnosis must identify the issue as overfitting or high variance. It should correctly assign the 1% error to estimated bias and calculate the 10% difference as estimated variance.

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Machine Learning Strategy

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Yearning @ DeepLearning.AI