Essay

Analyze the role of speech recognition as an example of an accelerating deep learning trend.

Question: In the context of end-to-end learning, explain why speech recognition is considered an example of "rich-output learning." Discuss the specific inputs and outputs involved, and how this relates to broader trends in modern deep learning as described in Machine Learning Yearning.

Sample answer: Speech recognition is considered rich-output learning because it takes audio as the input and produces a full text transcription as the output, rather than just a simple scalar value. This exemplifies an accelerating trend in deep learning where models can learn end-to-end to generate complex outputs like sentences, images, or audio, provided that the right labeled (input, output) pairs are available.

Key points:

  • Identifies audio as the input.
  • Identifies transcription as the output.
  • Notes that the output is richer than a single number.
  • Connects to the deep learning trend requiring the right (input, output) labeled pairs.

Rubric: A strong answer should clearly identify the input (audio) and output (transcription), explain that the output is richer than a single number, and connect this to the deep learning trend of training end-to-end models with appropriate labeled data pairs.

0

1

Updated 2026-05-27

Contributors are:

Who are from:

Tags

Machine Learning

Deep Learning

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Machine Learning Strategy

Machine Learning Yearning @ DeepLearning.AI