Learn Before
Diagnose and resolve the dataset distribution gap in a mobile cat detector
Case context: You are building a cat image detector. Your training set consists of high-quality, clear internet images of cats. However, your dev set is collected from mobile app users who frequently move their cellphones while taking pictures, resulting in significant motion blur. Consequently, your detector performs poorly on the dev set.
Question: What data modification action should you take to improve the detector's performance on the dev set, and why?
Sample answer: You should apply simulated motion blur to the non-blurry internet images in the training set. This modifies the training data to align its distribution with the dev set, allowing the model to learn features robust to the motion blur caused by cellphone users' movements.
Key points:
- Apply simulated motion blur to the clear training images.
- Match the training set distribution to the blurry dev set distribution.
- Address the mismatch caused by mobile users moving their cellphones.
Rubric: Learners should identify simulated motion blur as the modification to apply to the training set and explain that it reduces the distribution gap between the clear training images and the blurry dev set images.
0
1
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Related
In the cat image detector example, why do dev set images tend to have more motion blur than training images?
True or False: Adding simulated motion blur to non-blurry training images can reduce the distribution mismatch between training and dev sets.
To make training images more similar to a dev set with motion blur, you can add simulated _____ to non-blurry training images.
Why is simulated motion blur added to non-blurry training images in the cat detector example?
Motion blur in dev-set cat images is caused by cellphone users slightly moving their phone while taking pictures.
To close the distribution gap in the cat detector, you add _____ to non-blurry training images.
Match each dataset or image source to its characteristic in the cat detector example.
Order the steps to apply simulated motion blur to improve a cat detector's performance on a blurry dev set.
What specific distribution gap does adding simulated motion blur to training images address in the cat detector example?
Internet images used as cat detector training data typically have the same level of motion blur as cellphone-captured dev-set images.
In the cat detector example, the non-blurry training images that receive simulated blur originally come from _____ images.
Match each observation in the cat detector scenario to the corresponding explanation or action.
Order the reasoning steps that lead to the decision to use simulated motion blur as an artificial data synthesis technique.
Write an essay explaining how simulated motion blur helps align cat detector datasets
Diagnose and resolve the dataset distribution gap in a mobile cat detector
Explain why simulated motion blur is added to clear training images