1Cademy - Applying human comparison to a superhuman medical imaging system

Learn Before

Human-Better Data Subsets Can Drive Progress After Surpassing Average Human Performance

Case Study

Case context: Your ML system for detecting anomalies in X-rays surpasses human radiologists in average accuracy across the dev/test set. However, error analysis reveals that for a specific subset—pediatric patients—human radiologists still have a higher accuracy rate than your algorithm.

Question: How should you utilize this pediatric image subset to drive further progress in your system, despite its overall superhuman average performance?

Sample answer: Because human radiologists still outperform the algorithm on the pediatric subset, the usual human-comparison techniques still apply to that subset. We can obtain higher-quality labels from pediatric radiologists, draw on their intuition to understand why they correctly identified anomalies the system missed, and use their accuracy on this subset as the performance target.
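The error analysis behind this answer can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual tooling: the record fields (`subset`, `label`, `model_pred`, `human_pred`), the function names, and the toy data are all hypothetical. It computes model and human accuracy per subset and reports the subsets where humans are still ahead, along with the human accuracy to use as a target.

```python
from collections import defaultdict


def accuracy_by_subset(records):
    """Return {subset: (model_accuracy, human_accuracy)}.

    Each record is a dict with hypothetical keys:
    'subset', 'label', 'model_pred', 'human_pred'.
    """
    # Per subset: [example count, model correct, human correct]
    counts = defaultdict(lambda: [0, 0, 0])
    for r in records:
        c = counts[r["subset"]]
        c[0] += 1
        c[1] += int(r["model_pred"] == r["label"])
        c[2] += int(r["human_pred"] == r["label"])
    return {s: (m / n, h / n) for s, (n, m, h) in counts.items()}


def human_better_subsets(records):
    """Return {subset: human_accuracy} for subsets where humans beat the model.

    The human accuracy doubles as the performance target for that subset.
    """
    return {
        s: human_acc
        for s, (model_acc, human_acc) in accuracy_by_subset(records).items()
        if human_acc > model_acc
    }


# Made-up toy data, for illustration only (not real results).
records = [
    {"subset": "adult", "label": 1, "model_pred": 1, "human_pred": 1},
    {"subset": "adult", "label": 0, "model_pred": 0, "human_pred": 1},
    {"subset": "pediatric", "label": 1, "model_pred": 0, "human_pred": 1},
    {"subset": "pediatric", "label": 0, "model_pred": 0, "human_pred": 0},
]

targets = human_better_subsets(records)
```

On this toy data only the pediatric subset is flagged. In practice each flagged subset would need enough examples for the accuracy gap to be meaningful before it is treated as a target.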

Key points: