Case Study

Evaluate the dev and test set distribution strategy for a mobile app development team experiencing slow progress.

Case context: A development team is building a commercial mobile app that recognizes handwritten notes. To speed up training, they collected high-resolution scanner images for their dev set, while their test set, which they use to evaluate the final system, consists of low-resolution photos taken with mobile phones. The team is struggling with slow progress and inconsistent performance metrics.

Question: Based on Andrew Ng's recommendations for making progress on a specific machine learning application, diagnose the issue with the team's current dataset strategy and propose a solution to improve their efficiency.

Sample answer: The team's dev and test sets are drawn from different distributions (scanner images vs. mobile phone photos). For a specific machine learning application, this reduces team efficiency: when the system does well on the dev set but poorly on the test set, the team cannot tell whether it has overfit the dev set or whether the test data is simply harder or different. To become more efficient, the team should draw both the dev and test sets from the same distribution (the mobile phone photos). Developing algorithms that generalize across distributions is an important research problem, but it is a poor fit for a team whose goal is fast progress on one specific application.

Key points:

  • The team's dev and test sets are currently from different distributions.
  • This mismatch slows the team's progress on the application.
  • The team should choose dev and test sets from the same distribution (mobile phone photos).
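The proposed fix can be sketched in a few lines: pool the mobile phone photos, shuffle them once, and cut the pool into a dev set and a test set so that both come from the same distribution. This is a minimal sketch, not the team's actual pipeline; the function name `split_dev_test`, the file names, the 50/50 split, and the fixed seed are all hypothetical choices made for illustration.

```python
import random


def split_dev_test(examples, dev_fraction=0.5, seed=0):
    """Shuffle one pool of examples and split it into (dev, test).

    Both outputs come from the same pool, so they share a distribution.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    shuffled = list(examples)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a reproducible split
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for the team's mobile phone photos.
mobile_photos = [f"mobile_{i:03d}.jpg" for i in range(200)]

# Scanner images are deliberately left out of both evaluation sets.
dev_set, test_set = split_dev_test(mobile_photos)
```

With this split, a gap between dev and test performance points to overfitting the dev set rather than to a difference between scanner images and phone photos.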

Rubric: The answer should identify that the dev and test sets are currently from different distributions, state that this mismatch reduces team efficiency, and propose drawing both dev and test sets from the same distribution (mobile phone photos) to align with application-focused progress.

Updated 2026-05-26

Tags

Machine Learning

Deep Learning

Machine Learning Strategy

Supervised Learning

Dive into Deep Learning @ D2L

Data Science
