Explain how to set up error analysis when dataset labels are suspected to be incorrect.
Question: Explain the guidelines for adding a mislabeled category to an error analysis spreadsheet as outlined in the text. Detail the condition that triggers this decision and list all six columns of the spreadsheet shown in the source.
Sample answer: According to the source, if you suspect that a significant fraction of the images is mislabeled, you should add a 'Mislabeled' category so that you can track what fraction of the examples carry an incorrect label. The resulting error analysis spreadsheet has these six columns: Image, Dog, Great cat, Blurry, Mislabeled, and Comments.
Key points:
- A 'Mislabeled' category is added when the fraction of mislabeled images is suspected to be significant.
- The purpose of the category is to track the fraction of examples that are mislabeled.
- The columns in the spreadsheet are Image, Dog, Great cat, Blurry, Mislabeled, and Comments.
Rubric: To earn full credit, the answer must: 1. State the condition for adding the category (suspecting the fraction of mislabeled images is significant), 2. State its purpose (to track the fraction of mislabeled examples), and 3. List all six columns correctly: Image, Dog, Great cat, Blurry, Mislabeled, and Comments.
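Worked example: the spreadsheet described above can be mirrored in a few lines of Python to show how the 'Mislabeled' fraction is tallied alongside the other categories. This is a minimal sketch, not something taken from the source. The five rows, their comments, and the helper name `category_fractions` are made up for illustration; only the six column names come from the source.

```python
# Column names as listed in the source spreadsheet.
COLUMNS = ["Image", "Dog", "Great cat", "Blurry", "Mislabeled", "Comments"]
# Every column except the image id and the free-text comment is a category flag.
CATEGORIES = ["Dog", "Great cat", "Blurry", "Mislabeled"]


def make_row(image, comment="", **flags):
    """Build one spreadsheet row; categories not passed as True default to False."""
    row = {"Image": image, "Comments": comment}
    for category in CATEGORIES:
        row[category] = bool(flags.get(category, False))
    return row


# Hypothetical review of five misclassified dev-set images (invented data).
rows = [
    make_row(1, "hypothetical: dog that looks like a cat", **{"Dog": True}),
    make_row(2, "hypothetical: out-of-focus photo", **{"Blurry": True}),
    make_row(3, "hypothetical: lion, shot in rain", **{"Great cat": True, "Blurry": True}),
    make_row(4, "hypothetical: labeler missed a cat", **{"Mislabeled": True}),
    make_row(5, "hypothetical: panther", **{"Great cat": True}),
]


def category_fractions(rows, categories=CATEGORIES):
    """Return, per category, the fraction of reviewed rows that are flagged.

    A row may be flagged in several categories, so the fractions
    do not have to sum to 1.
    """
    if not rows:
        return {category: 0.0 for category in categories}
    total = len(rows)
    return {
        category: sum(1 for row in rows if row[category]) / total
        for category in categories
    }


fractions = category_fractions(rows)
for category in CATEGORIES:
    print(f"{category}: {fractions[category]:.0%} of reviewed errors")
```

The 'Mislabeled' entry of `fractions` is the number the category exists to track: the share of reviewed examples whose label was itself wrong.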
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
When should you add a 'Mislabeled' category to your error analysis spreadsheet?
True or False: You should add a 'Mislabeled' category to your error analysis spreadsheet only when you are certain—not merely suspicious—that examples are mislabeled.
When you suspect a significant fraction of dev set examples have incorrect labels, you should add a _____ column to your error analysis spreadsheet.
When should you add a 'Mislabeled' category to your error analysis spreadsheet according to Machine Learning Yearning?
The 'Mislabeled' column in the error analysis spreadsheet tracks examples where the ground-truth label itself is incorrect.
You should add a Mislabeled category to the error analysis spreadsheet when you suspect the _____ of mislabeled images is significant.
Match each column from the Machine Learning Yearning error analysis spreadsheet to what it captures.
Order the steps for setting up and using a Mislabeled column in an error analysis spreadsheet.
Which of the following is NOT listed as a column in the error analysis spreadsheet shown in Machine Learning Yearning (p. 33)?
According to Machine Learning Yearning, a 'Mislabeled' column should be included in every error analysis spreadsheet as a standard default.
The error analysis spreadsheet in Machine Learning Yearning has columns: Image, Dog, Great cat, Blurry, _____, and Comments.
Match each description to the corresponding column name in the Machine Learning Yearning error analysis spreadsheet.
Order the reasoning steps that lead a practitioner to add a 'Mislabeled' column rather than ignoring suspected label errors.
Case study: Designing an error analysis spreadsheet for a dog classifier with incorrect labels.
Identify the tracking purpose and trigger for the 'Mislabeled' column.