Essay

Analyzing the impact of inconsistent auxiliary data on a target task

Question: Explain what makes an auxiliary data source 'inconsistent' with a target task, using the example of predicting New York City housing prices with Detroit data. Why would mixing these datasets be detrimental?

Sample answer: An auxiliary data source is inconsistent with a target task when the same input features map to systematically different target labels depending on the data source. For instance, if the target task is predicting New York City housing prices from house size, Detroit housing data is an inconsistent auxiliary source: for the same house size (input feature x), the price (target label y) differs drastically and is much higher in NYC than in Detroit. Mixing the two datasets would hurt performance on the target task because the model receives contradictory signals for the same input. Its predictions are pulled toward a compromise between the two markets, which keeps it from learning the true size-to-price mapping for NYC.

Key points:

  • Inconsistency occurs when the same input features imply different labels across datasets.
  • A house of the same size has very different prices in NYC versus Detroit.
  • Mixing inconsistent datasets provides contradictory training signals.
  • Including inconsistent auxiliary data hurts the model's performance on the target task.
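
The key points above can be illustrated with a small simulation. The following is a minimal sketch on synthetic data, not real housing figures: the per-square-foot prices (1000 for NYC, 100 for Detroit), the noise level, and the helper names `make_city`, `fit_slope`, and `mae` are all assumptions chosen for illustration. It fits a one-parameter linear model on NYC data alone and on NYC plus Detroit data, then compares the error of both on held-out NYC houses.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_city(n, price_per_sqft, noise_std):
    """Synthetic (size, price) pairs: price = price_per_sqft * size + noise."""
    size = rng.uniform(500, 3000, size=n)
    price = price_per_sqft * size + rng.normal(0, noise_std, size=n)
    return size, price

def fit_slope(size, price):
    """Least-squares slope w for the no-intercept model price ~ w * size."""
    return float(size @ price / (size @ size))

def mae(w, size, price):
    """Mean absolute error of the prediction w * size."""
    return float(np.mean(np.abs(w * size - price)))

# Hypothetical price levels: same feature (size), very different labels (price).
nyc_train = make_city(200, 1000.0, 50_000)
detroit_train = make_city(200, 100.0, 50_000)
nyc_test = make_city(200, 1000.0, 50_000)

# Model 1: trained on the target task only.
w_nyc = fit_slope(*nyc_train)

# Model 2: trained on the target data mixed with the inconsistent auxiliary data.
mixed_size = np.concatenate([nyc_train[0], detroit_train[0]])
mixed_price = np.concatenate([nyc_train[1], detroit_train[1]])
w_mixed = fit_slope(mixed_size, mixed_price)

# Both models are evaluated on held-out NYC houses (the target task).
err_nyc_only = mae(w_nyc, *nyc_test)
err_mixed = mae(w_mixed, *nyc_test)

print(f"slope, NYC only: {w_nyc:.1f}   slope, mixed: {w_mixed:.1f}")
print(f"NYC test MAE, NYC only: {err_nyc_only:,.0f}")
print(f"NYC test MAE, mixed:    {err_mixed:,.0f}")
```

Under these assumptions the mixed model's slope lands between the two markets, so it underprices every NYC house and its error on the target task is far larger than that of the NYC-only model. A more flexible model given a city indicator as an extra feature could separate the two mappings, but that changes the inputs; with house size as the only feature, the two datasets remain contradictory.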

Rubric: A strong response will define data inconsistency in terms of feature-to-label mapping, correctly apply the NYC vs. Detroit example, and explicitly state that the model receives contradictory signals that degrade performance on the target task.


Updated 2026-06-13


Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI