Evaluating Cat Recognition Consistency Across Web and App Sources
Case context: A machine learning team is training a cat classifier for a mobile app. They have a target dataset of mobile app photos and access to a large auxiliary dataset of cat images downloaded from the internet. The team observes that a cat can be identified in an image regardless of whether the photo came from the web or a mobile app.
Question: Using the concept of consistent auxiliary data sources, analyze whether the team should include the internet images in their training data. State the mapping condition that defines consistency, and identify the downside and upside the team should expect.
Sample answer: The team should include the internet images because they are consistent with the mobile app images; a single function f(x) can reliably map from input x (the image) to target output y (cat vs. no cat) without knowing the image origin. There is little downside to this inclusion other than the extra computational cost, and there is a possible significant upside from having more data.
Key points:
- Internet images are consistent with mobile app images because a single function f(x) maps inputs to labels regardless of source.
- The origin of the image x does not need to be known to predict the label y.
- Including this consistent auxiliary data has little downside other than computational cost, with potential significant upside.
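The mapping condition can be made concrete with a toy check. The sketch below is only an illustration, not a method from the source: it assumes each example is an (x, origin, y) tuple, uses hand-made feature tuples as hypothetical stand-ins for images, and introduces a hypothetical helper named labels_depend_only_on_input. The helper reports whether every input x carries exactly one label y once the sources are pooled, which is a necessary condition for a single f(x) to exist without reference to origin.

```python
from collections import defaultdict


def labels_depend_only_on_input(examples):
    """Return True if each input x has exactly one label y across all sources.

    examples: iterable of (x, origin, y) tuples, where x is hashable.
    The origin field is deliberately ignored: if the labels are still
    unambiguous, a single f(x) can fit the pooled data.
    """
    labels_by_input = defaultdict(set)
    for x, _origin, y in examples:
        labels_by_input[x].add(y)
    return all(len(labels) == 1 for labels in labels_by_input.values())


# Hypothetical stand-ins for images: x is a tuple of hand-made features.
web = [
    (("whiskers", "pointed_ears"), "web", "cat"),
    (("wheels", "windshield"), "web", "no_cat"),
]
app = [
    (("whiskers", "pointed_ears"), "app", "cat"),
    (("wheels", "windshield"), "app", "no_cat"),
]

# Same inputs, same labels, whatever the origin: the sources are consistent.
web_is_consistent = labels_depend_only_on_input(web + app)

# A hypothetical source that labels the same input differently would break
# the condition, because f(x) would need to know the origin.
conflicting = [(("whiskers", "pointed_ears"), "other", "no_cat")]
other_is_consistent = labels_depend_only_on_input(app + conflicting)

print(web_is_consistent, other_is_consistent)
```

Real images almost never repeat exactly, so in practice this check is a thought experiment rather than a test to run on pixels; the point is that the cat label never depends on where the photo came from.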
Grading rubric:
- Explains that the internet images should be included because they represent a consistent auxiliary source.
- Explains the mapping condition: a function f(x) maps input x to label y without needing to know the data origin.
- Notes that there is little downside other than computational cost, with a possible significant upside.
Tags
Machine Learning
Deep Learning
Supervised Learning
Dive into Deep Learning @ D2L
Data Science
Machine Learning Strategy
Machine Learning Yearning @ DeepLearning.AI
Related
What is the defining condition for an auxiliary data source to be 'consistent' with a target task?
In ML Yearning's cat example, a model must know whether an image came from the internet or a mobile app to correctly label it.
An auxiliary data source is consistent with the target task when the same _____ works across both sources.
Match each term to its role in the concept of a consistent auxiliary data source.
Order the steps for determining whether an auxiliary data source is consistent with a target task.
According to ML Yearning, what is the main practical downside of including a consistent auxiliary data source in training?
ML Yearning states that including a consistent auxiliary source offers 'little downside' and 'some possible significant upside.'
In ML Yearning's cat example, internet images are a consistent auxiliary source because f(x) predicts the label without knowing the image _____.
Match each ML Yearning concept to its correct description in the context of consistent auxiliary data sources.
Order the reasoning steps that justify including a consistent auxiliary data source in model training.
Analyzing Consistency and Cost of Auxiliary Data in ML
Evaluating Cat Recognition Consistency Across Web and App Sources
Role of Image Origin in Consistent Data Mapping