State the primary risk of fixing only misclassified dev set labels.
Question: What is the primary risk to your evaluation metrics if you only correct the labels of dev set examples that your system misclassified?
Sample answer: The primary risk is biasing the dev set evaluation. Because you review only the examples the system appeared to get wrong, every correction can only turn an apparent error into a correct prediction. Mislabeled examples that the system appeared to get right, which may in fact be errors, are never reviewed, so the system's measured performance is artificially inflated.
Key points:
- It introduces bias into the dev set evaluation.
- It artificially inflates the measured performance metrics of the system.
Rubric: The answer should state that it introduces evaluation bias, specifically an optimistic bias that artificially inflates the system's measured performance.
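Worked example: the inflation can be seen on a toy dev set. The sketch below is illustrative only. The ten labels and predictions are made up, and the helper names `accuracy` and `correct_selectively` are hypothetical. It compares accuracy measured against the original noisy labels, against labels corrected only where the model appeared wrong, and against fully corrected labels.

```python
def accuracy(preds, labels):
    """Fraction of predictions that match the given labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def correct_selectively(preds, noisy, true):
    """Fix a label only where the model disagrees with the noisy label,
    i.e. review only the examples that look misclassified."""
    return [t if p != n else n for p, n, t in zip(preds, noisy, true)]


# Made-up data: indices 1, 3, and 9 are mislabeled in the noisy dev set.
true_labels  = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]
noisy_labels = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1]
predictions  = [1, 1, 0, 1, 1, 0, 0, 0, 1, 1]

selective_labels = correct_selectively(predictions, noisy_labels, true_labels)

acc_noisy = accuracy(predictions, noisy_labels)          # before any correction
acc_selective = accuracy(predictions, selective_labels)  # only apparent errors reviewed
acc_full = accuracy(predictions, true_labels)            # every label reviewed

print(f"noisy labels:         {acc_noisy:.1f}")
print(f"selective correction: {acc_selective:.1f}")
print(f"full correction:      {acc_full:.1f}")
```

On this toy data the selectively corrected accuracy (0.9) exceeds the accuracy against fully corrected labels (0.7). The two mislabeled examples the model appeared to get right (indices 3 and 9) are never reviewed, so they keep counting in the model's favor.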
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
Practical Convenience Causes Label-Correction Bias in Dev Sets
When Label-Correction Bias Is Acceptable Versus Problematic
What risk arises when you fix label errors only for the dev-set examples your classifier misclassified?
True or False: Correcting mislabeled dev-set examples only where your system was wrong produces an unbiased evaluation.
Fixing labels only on examples your system _____ can introduce bias into dev-set evaluation.
Why does fixing labels only on misclassified dev examples introduce bias into the evaluation?
Fixing labels only on misclassified dev examples can introduce bias into your evaluation.
To avoid label-correction bias, you should review labels of _____ dev examples, not only misclassified ones.
Match each label-correction practice to its effect on dev set evaluation bias.
Order the steps a team should follow when correcting dev set labels to avoid introducing bias.
What is the most likely effect on measured dev set accuracy when labels are corrected only on misclassified examples?
Reviewing only the dev examples your model misclassified is sufficient to ensure an unbiased dev set evaluation.
Label-correction bias arises because mislabeled examples the system classified _____ are never reviewed or fixed.
Match each term related to label-correction bias to its correct definition.
Order the reasoning steps that explain why fixing only misclassified labels introduces bias into dev set evaluation.
Explain how selective label correction on misclassified examples alters estimated dev set performance.
Evaluating a team's decision to correct only misclassified dev set labels.