Case Study

Structuring diagnostic analysis for a new classifier across distinct data sources.

Case context: An ML team has trained a classifier on high-resolution web images (Distribution A) but plans to deploy it on low-resolution mobile uploads (Distribution B). They want to systematically understand how different types of errors relate to each other using an error table.

Question: Based on Andrew Ng's framework, how should the team structure their error table to evaluate the classifier, and what specific rows and columns should they include to gain insight into the algorithm's behavior?

Sample answer: The team should lay out the table with the two data distributions (Distribution A and Distribution B) as the columns, i.e. along the x-axis. Along the y-axis, they should include three rows, one per error type: human-level error, the algorithm's error on examples it has trained on, and the algorithm's error on examples it has not trained on. Filling in this grid lets them compare the error types side by side and diagnose how the algorithm behaves on each distribution.
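The grid described in the sample answer can be sketched in a few lines of Python. This is a minimal illustration, not part of the framework itself: the function names `build_error_table` and `format_error_table`, the row labels, and every number in the usage example are hypothetical placeholders rather than measurements from the case.

```python
# Hypothetical sketch of the error table: three error types (rows)
# by two data distributions (columns). All names and numbers are illustrative.

ROWS = (
    "human-level error",
    "error on examples trained on",
    "error on examples not trained on",
)
COLUMNS = ("Distribution A", "Distribution B")


def build_error_table(measurements):
    """Return a rows-by-columns grid; cells without a measurement stay None."""
    table = {row: {col: None for col in COLUMNS} for row in ROWS}
    for (row, col), value in measurements.items():
        if row not in table or col not in table[row]:
            raise KeyError(f"unknown cell: {(row, col)}")
        table[row][col] = value
    return table


def _cell(value):
    """Render one cell as a percentage, or 'n/a' if it was never measured."""
    return "n/a" if value is None else format(value, ".1%")


def format_error_table(table):
    """Render the grid as aligned plain text, one line per error type."""
    width = max(len(row) for row in ROWS)
    header = " " * width + "".join(f"  {col:>14}" for col in COLUMNS)
    lines = [header]
    for row in ROWS:
        cells = "".join(f"  {_cell(table[row][col]):>14}" for col in COLUMNS)
        lines.append(f"{row:<{width}}{cells}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up error rates, used only to show how the grid is filled in.
    example = build_error_table({
        ("human-level error", "Distribution A"): 0.01,
        ("error on examples trained on", "Distribution A"): 0.02,
        ("error on examples not trained on", "Distribution A"): 0.03,
        ("error on examples not trained on", "Distribution B"): 0.10,
    })
    print(format_error_table(example))
```

Unmeasured cells are kept as `None` rather than zero so that a gap in the table stays visible. In this case the classifier was trained only on Distribution A, so the "trained on" cell for Distribution B would remain empty unless some mobile uploads were added to the training set.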

Key points:

  • Place the two data distributions on the x-axis, as the columns.
  • Place the three error types on the y-axis, as the rows.
  • The error types are human-level error, error on examples the algorithm has trained on, and error on examples it has not trained on.

Rubric: The answer must correctly place the two distributions on the x-axis and the three specific error types on the y-axis, noting that this structure helps diagnose the relationship between different error types.

Updated 2026-05-27

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI