Short Answer

Validating Data Needs via Training Dev Sets

Question: Suppose your training set includes some data drawn from the dev/test set distribution. What hypothesis is further validated if the model performs well on that specific training data but poorly on a similar training dev subset that it did not train on?

Sample answer: It validates the hypothesis that getting more data from that specific target distribution would improve performance.

Key points:

  • Validates the hypothesis that getting more data from the target distribution would help.
  • Indicates the model can learn the distribution but is currently overfitting the available examples.

Rubric: The answer should state that it validates the need for more target-distribution data to improve generalization.
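
The comparison behind this answer can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0.05 gap threshold, and the toy predictions and labels are assumptions made for the example, not part of the source material.

```python
def error_rate(predictions, labels):
    """Fraction of examples where the prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def more_target_data_likely_helps(seen_error, held_out_error, gap_threshold=0.05):
    """Hypothetical heuristic: True when the model fits the target-distribution
    examples it trained on but does much worse on held-out examples from the
    same distribution. The threshold is an arbitrary, assumed value."""
    return (held_out_error - seen_error) > gap_threshold


# Toy, made-up data: target-distribution examples that were in the training set.
seen_error = error_rate([1, 0, 1, 1], [1, 0, 1, 1])

# Toy, made-up data: a training dev subset from the same distribution, held out.
held_out_error = error_rate([1, 1, 0, 0], [1, 0, 1, 0])

print(seen_error, held_out_error)
print(more_target_data_likely_helps(seen_error, held_out_error))
```

A large gap between the two error rates is the pattern the question describes: the model can fit the target-distribution examples it has seen but does not yet generalize to new ones, which supports collecting more data from that distribution.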

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI