Case Study

Scaling a speech recognition system

Case context: A team has built a speech recognition system using a moderately sized neural network and a dataset of 10,000 hours of audio. The performance has plateaued. They have the budget to either significantly increase the size of their neural network or collect 100,000 more hours of audio, but they cannot do both immediately.

Question: Given the principle that the best performance comes from combining a very large network with a huge amount of data, what should the team recognize about its ultimate goal, even if it must scale one dimension at a time?

Sample answer: The team should recognize that the best achievable performance ultimately requires both a very large neural network and a huge amount of data. Scaling only one dimension eventually hits a limit: a larger network is still constrained by the 10,000 hours of audio it can learn from, and more audio is still constrained by the capacity of a moderately sized network. Whichever step comes first, the long-term roadmap must plan to expand both network capacity and dataset volume.

Key points:

  • Diagnose that scaling only one factor has limits.
  • Recognize the ultimate goal is scaling both dimensions.
  • Best performance requires a very large network AND huge data.
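
The key points above can be illustrated with a toy model. This is only a sketch: the additive form of the error, the function name toy_error, and every constant in it are assumptions made up for illustration, not measurements from any real speech system. It is meant to show the shape of the argument (each single-factor upgrade helps, but only the combined upgrade approaches the floor), not to say which step the team should take first.

```python
# Toy illustration only. The functional form and all constants are
# assumptions chosen for demonstration, not results from a real system.

def toy_error(network_size: float, data_hours: float) -> float:
    """Hypothetical error: an irreducible floor, plus one term that shrinks
    as the network grows and one term that shrinks as the data grows."""
    floor = 0.02                                # assumed irreducible error
    capacity_term = 1.0 / network_size ** 0.5   # assumed capacity-limited error
    data_term = 3.0 / data_hours ** 0.5         # assumed data-limited error
    return floor + capacity_term + data_term


# Hypothetical starting point: a moderately sized network, 10,000 hours.
baseline = toy_error(network_size=100, data_hours=10_000)

# Option A: much larger network, same data.
bigger_net = toy_error(network_size=10_000, data_hours=10_000)

# Option B: same network, 100,000 additional hours of audio.
more_data = toy_error(network_size=100, data_hours=110_000)

# Long-term goal: both upgrades together.
both = toy_error(network_size=10_000, data_hours=110_000)

# Pushing one factor to an extreme still leaves the other term in place.
huge_net_only = toy_error(network_size=1e12, data_hours=10_000)

for name, err in [
    ("baseline", baseline),
    ("bigger network only", bigger_net),
    ("more data only", more_data),
    ("both", both),
    ("enormous network, same data", huge_net_only),
]:
    print(f"{name:30s} {err:.4f}")
```

In this sketch, even an enormous network cannot push the error below the floor plus the data-limited term, which mirrors the point that scaling a single factor has limits. Which single option looks better here depends entirely on the made-up constants and should not be read as a recommendation.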

Rubric: Evaluates whether the learner understands that scaling both dimensions is ultimately necessary for the best performance.

Updated 2026-06-13

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI