Essay

Explain the rationale and challenges of approximating future dev/test data before a mobile application launch.

Question: Before launching a mobile application, actual user data is not yet available to form dev/test sets. Explain why it is still critical to attempt to approximate future user data, and discuss how this approximation compares to the actual data you expect to face in the future.

Sample answer: Approximating future user data is necessary because without user data, you cannot construct dev/test sets that reflect the target distribution you need to perform well on in the future. By approximating it (e.g., having friends take mobile-phone cat pictures), you create a proxy dataset. However, the limitation is that this approximated data might still fail to perfectly reflect the actual future user data distribution once the app launches, but it is better than not having a representative set.

Key points:

  • Dev/test sets must reflect the target distribution the model needs to perform well on in the future.
  • Before launch, actual user data is unavailable, requiring approximation.
  • Approximation can be done using proxy methods, such as asking friends to take mobile-phone pictures.
  • Approximated data is only an approximation and may not perfectly represent final user data distributions.

Rubric: The answer should explain the need to approximate target distributions pre-launch to build dev/test sets, and note that the approximated data may not perfectly represent actual future user data.

0

1

Updated 2026-05-26

Contributors are:

Who are from:

Tags

Machine Learning

Deep Learning

Machine Learning Strategy

Supervised Learning

Dive into Deep Learning @ D2L

Data Science

Related