1Cademy - Splitting a Large Dev Set into a Manually Examined Subset and a Hands-Off Subset

Learn Before

Error Analysis

Concept

Splitting a Large Dev Set into a Manually Examined Subset and a Hands-Off Subset

When the dev set is large, manually examining every misclassified example can take a long time. For example, a dev set of 5,000 examples with a 20% error rate yields about 1,000 misclassified images, and one may decide not to use all of them for error analysis. In that case, the dev set can be explicitly split into two subsets: one that is manually examined and one that is not. Because decisions are made by looking at its examples, the manually examined portion will be overfit more rapidly, while the portion that is not looked at can still be used to tune parameters. The explicit split also makes it possible to tell when manual error analysis is causing overfitting: if performance on the examined portion improves much faster than performance on the hands-off portion, the examined portion has likely been overfit.
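The split described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the function names (`split_dev_set`, `error_rate`, `overfitting_gap`), the choice of 500 examined examples, and the simulated error labels are all hypothetical and do not come from the original text. The sketch partitions dev-set indices at random, then compares the error rate on the two subsets; a hands-off error rate that stays well above the examined error rate suggests the examined portion is being overfit.

```python
import random


def split_dev_set(n_examples, n_examined, seed=0):
    """Randomly partition dev-set indices into an examined and a hands-off subset."""
    if not 0 < n_examined < n_examples:
        raise ValueError("n_examined must be between 1 and n_examples - 1")
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split stable over time
    return sorted(indices[:n_examined]), sorted(indices[n_examined:])


def error_rate(is_error, indices):
    """Fraction of the given dev-set indices that the model misclassifies."""
    return sum(1 for i in indices if is_error[i]) / len(indices)


def overfitting_gap(is_error, examined, hands_off):
    """Hands-off error minus examined error; a large positive gap hints at overfitting."""
    return error_rate(is_error, hands_off) - error_rate(is_error, examined)


if __name__ == "__main__":
    # Assumed sizes: 5,000 dev examples, 500 of them set aside for manual inspection.
    examined, hands_off = split_dev_set(5000, 500)

    # Simulated per-example outcomes at roughly a 20% error rate (illustrative only).
    rng = random.Random(1)
    is_error = [rng.random() < 0.2 for _ in range(5000)]

    print("examined error: ", error_rate(is_error, examined))
    print("hands-off error:", error_rate(is_error, hands_off))
    print("gap:            ", overfitting_gap(is_error, examined, hands_off))
```

Keeping the split fixed (here via a seed) matters: if examples drift between the two subsets over time, the hands-off portion stops being an unbiased check on the examined one.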

Updated 2026-07-19
