Case Study

Diagnosing a performance gap between the training and dev sets in a speech recognition system

Case context: You train a speech recognition model that achieves high accuracy on your training data. However, when you evaluate it on your dev set, which is drawn from a different distribution than the training set, it performs poorly. This points to a data mismatch problem.

Question: Based on Andrew Ng's recommendations, what specific analysis should you carry out to begin resolving this data mismatch problem?

Sample answer: To begin resolving this data mismatch problem, you should try to understand which properties of the data differ between the training-set and dev-set distributions.

Key points:

  • Recognize that the mismatch arises from differing training and dev-set distributions.
  • Propose analyzing the data to understand which specific properties differ between the two sets.

Rubric: The learner must state that the next recommended step is to try to understand which properties of the data differ between the training-set and dev-set distributions.
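
Illustration: One way to make "understand which properties differ" concrete is to hand-label a small sample of examples from each set with a few properties, then compare how often each value occurs in the two sets. The sketch below is only an illustration under stated assumptions. The property names (`noise`, `accent`), the sample labels, and the helper functions are hypothetical, and total variation distance is just one convenient way to score the gap. None of this comes from the original recommendation.

```python
from collections import Counter


def category_shares(examples, prop):
    """Fraction of examples taking each value of one property."""
    counts = Counter(ex[prop] for ex in examples)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


def distribution_gap(train, dev, prop):
    """Total variation distance between train and dev shares (0 = same, 1 = disjoint)."""
    p = category_shares(train, prop)
    q = category_shares(dev, prop)
    values = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in values)


def rank_properties(train, dev, props):
    """Properties sorted from largest to smallest train/dev gap."""
    gaps = {prop: distribution_gap(train, dev, prop) for prop in props}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hand-labeled samples; real labels would come from listening to clips.
train = [
    {"noise": "quiet", "accent": "US"},
    {"noise": "quiet", "accent": "US"},
    {"noise": "quiet", "accent": "UK"},
    {"noise": "car", "accent": "US"},
]
dev = [
    {"noise": "car", "accent": "US"},
    {"noise": "car", "accent": "UK"},
    {"noise": "car", "accent": "US"},
    {"noise": "quiet", "accent": "US"},
]

ranking = rank_properties(train, dev, ["noise", "accent"])
for prop, gap in ranking:
    print(f"{prop}: {gap:.2f}")
```

In this made-up sample the background-noise labels differ sharply between the two sets while the accent labels do not, so noise would be the first property to investigate.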

Updated 2026-05-26

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy
