Case Study

Evaluating Cat Recognition Consistency Across Web and App Sources

Case context: A machine learning team is training a cat classifier for a mobile app. They have a target dataset of mobile app photos and access to a large auxiliary dataset of cat images downloaded from the internet. The team observes that a cat can be identified in an image regardless of whether the photo came from the web or a mobile app.

Question: Using the concept of consistent auxiliary data sources, analyze whether the team should include the internet images in their training data. State the mapping condition that defines this relationship, and identify what downside or upside they should expect.

Sample answer: The team should include the internet images because they are consistent with the mobile app images: a single function f(x) can reliably map an input x (the image) to the target output y (cat vs. no cat) without knowing where the image came from. Including them has little downside beyond the extra computational cost, and a potentially significant upside from the additional training data.

Key points:

  • Internet images are consistent with mobile app images because a single function f(x) maps inputs to labels regardless of source.
  • The origin of the image x does not need to be known to predict the label y.
  • Including this consistent auxiliary data has little downside beyond the computational cost, and a potentially significant upside from the additional training data.
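
The mapping condition in these key points can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and does not come from the case study: 2-D feature vectors stand in for images, `true_label` is a made-up labeling rule that depends only on x, both sources are drawn from the same distribution, and a nearest-centroid classifier plays the role of f(x). The source field is stored for bookkeeping but is never used for labeling or prediction.

```python
import random

def true_label(x):
    """Hypothetical ground truth: the label depends only on the features x."""
    return 1 if x[0] + x[1] > 1.0 else 0

def make_examples(n, source, rng):
    """Build (features, label, source) triples; source is bookkeeping only."""
    examples = []
    for _ in range(n):
        x = (rng.random(), rng.random())
        examples.append((x, true_label(x), source))
    return examples

def fit_centroids(examples):
    """Fit one nearest-centroid classifier f(x); the source field is ignored."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for x, y, _source in examples:
        sums[y][0] += x[0]
        sums[y][1] += x[1]
        sums[y][2] += 1
    # Skip any class that has no examples to avoid dividing by zero.
    return {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items() if s[2] > 0}

def predict(centroids, x):
    """Return the label of the closest class centroid."""
    return min(
        centroids,
        key=lambda y: (x[0] - centroids[y][0]) ** 2 + (x[1] - centroids[y][1]) ** 2,
    )

def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the stored label."""
    correct = sum(predict(centroids, x) == y for x, y, _source in examples)
    return correct / len(examples)

rng = random.Random(0)
app_train = make_examples(20, "app", rng)    # small target dataset
web_train = make_examples(2000, "web", rng)  # large auxiliary dataset
app_test = make_examples(500, "app", rng)    # held-out target data

# Pool the two sources and train a single f(x) on the union.
pooled = app_train + web_train
app_only_acc = accuracy(fit_centroids(app_train), app_test)
pooled_acc = accuracy(fit_centroids(pooled), app_test)
print(f"app only: {app_only_acc:.3f}  pooled: {pooled_acc:.3f}")
```

Because the labeling rule never looks at the source, pooling adds training examples without introducing conflicting labels; the only extra cost in this sketch is the time spent processing the larger training set.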

Grading rubric:

  • Explains that the internet images should be included because they represent a consistent auxiliary source.
  • Explains the mapping condition: a function f(x) maps input x to label y without needing to know the data origin.
  • Notes that there is little downside beyond the computational cost, and a potentially significant upside.

Updated 2026-05-27

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI