Case Study

Applying human comparison to a superhuman medical imaging system

Case context: Your ML system for detecting anomalies in X-rays surpasses human radiologists in average accuracy on the dev/test set. However, error analysis reveals that on one specific subset, images of pediatric patients, human radiologists are still more accurate than your algorithm.

Question: How should you use this pediatric image subset to drive further progress, even though the system's average performance is already superhuman?

Sample answer: Because humans still outperform the algorithm on the pediatric subset, human comparison still applies to that data. We can obtain higher-quality labels from pediatric radiologists, draw on their intuition to understand why they correctly identified anomalies the system missed, and use their accuracy on pediatric images as the performance target.
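The subset-level comparison behind this answer can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the record fields (`subset`, `label`, `model_pred`, `human_pred`), both function names, and the toy data are assumptions made up for the example. It computes model and human accuracy per subset and keeps only the subsets where humans are still ahead, using the human accuracy as the target.

```python
from collections import defaultdict


def accuracy_by_subset(records):
    """Return {subset: (model_accuracy, human_accuracy)}."""
    # Per subset: [number of examples, model correct, human correct]
    counts = defaultdict(lambda: [0, 0, 0])
    for r in records:
        c = counts[r["subset"]]
        c[0] += 1
        c[1] += r["model_pred"] == r["label"]  # True counts as 1
        c[2] += r["human_pred"] == r["label"]
    return {s: (m / n, h / n) for s, (n, m, h) in counts.items()}


def human_comparison_targets(records):
    """Subsets where humans still beat the model, mapped to the
    human accuracy that serves as the performance target."""
    per_subset = accuracy_by_subset(records)
    return {s: h for s, (m, h) in per_subset.items() if h > m}


# Toy records, invented purely for illustration (hypothetical field names).
records = [
    {"subset": "adult", "label": 1, "model_pred": 1, "human_pred": 0},
    {"subset": "adult", "label": 0, "model_pred": 0, "human_pred": 0},
    {"subset": "pediatric", "label": 1, "model_pred": 0, "human_pred": 1},
    {"subset": "pediatric", "label": 0, "model_pred": 0, "human_pred": 0},
]

targets = human_comparison_targets(records)
print(targets)  # {'pediatric': 1.0}
```

In practice the subsets flagged this way are the ones worth sending back to specialists for relabeling and manual error analysis; the overall average would hide them.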

Key points:

  • Recognize the pediatric subset as a domain where human comparison still applies.
  • Obtain better labels from humans for pediatric images.
  • Use human intuition to analyze system errors on pediatric images.
  • Set human performance on pediatric images as the desired target.

Rubric: Evaluates application of the three human comparison benefits (labels, intuition, targets) specifically to the pediatric image subset.

Updated 2026-06-13
