Learn Before
Multiple Hypothesis Testing in Model Evaluation
When evaluating multiple classifiers, say $f_1, \ldots, f_k$, on the same test dataset, the probability of obtaining a misleading test-set performance score for at least one model increases significantly compared to evaluating a single classifier. For a single classifier $f$, a practitioner might be highly confident that its empirical test error $\epsilon_\mathcal{D}(f)$ is close to its true population error $\epsilon(f)$. However, as the number of classifiers grows, the risk of a false discovery compounds, making it difficult to guarantee that the best-performing model did not achieve its seemingly low error rate merely by chance. This phenomenon directly relates to the statistical challenge of multiple hypothesis testing.
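The compounding risk can be made concrete with a small calculation. As a minimal sketch (the test-set size, error threshold, and assumption of independent random-guessing classifiers are illustrative, not from the source), the snippet below uses the binomial distribution to compute how likely a classifier with a true error of 0.5 is to post an impressively low empirical error, and how that chance explodes when many models are tried:

```python
import math

def prob_misleading_score(n, threshold, p_error=0.5):
    """Probability that a single random-guessing classifier (true error
    p_error) achieves empirical test error <= threshold on n examples,
    via the binomial CDF."""
    k_max = math.floor(threshold * n)
    return sum(math.comb(n, k) * p_error**k * (1 - p_error)**(n - k)
               for k in range(k_max + 1))

def prob_at_least_one_misleading(num_models, n, threshold):
    """With num_models independent classifiers, the chance that at least
    one looks misleadingly good compounds: 1 - (1 - p)^num_models."""
    p = prob_misleading_score(n, threshold)
    return 1 - (1 - p) ** num_models

n = 100          # assumed test-set size, for illustration
threshold = 0.4  # a "seemingly low" empirical error for a 0.5-error model

# One model rarely looks this good by chance; a thousand almost surely will.
print(prob_at_least_one_misleading(1, n, threshold))
print(prob_at_least_one_misleading(1000, n, threshold))
```

Under these assumptions a single random classifier beats the 0.4 threshold only a few percent of the time, yet with a thousand candidates the event becomes nearly certain, which is exactly why the best score among many models cannot be taken at face value.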
Tags
D2L
Dive into Deep Learning @ D2L