1Cademy - Avoiding Test Set Decisions During Regular Progress Tracking

Learn Before

Dev Set Overfitting from Repeated Evaluation

Concept

Avoiding Test Set Decisions During Regular Progress Tracking

A team may evaluate the system on the test set regularly to track progress, such as weekly or monthly. The test set should not be used to make algorithm decisions, including whether to roll back to a previous system. Using the test set for such decisions can cause overfitting to the test set, so the test set can no longer be counted on to give a completely unbiased estimate of system performance.

Updated 2026-06-14

Contributors are:

Who are from:

References

Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)
Machine Learning Yearning (Deeplearning.ai)

Learn After

What is the primary risk of using the test set to make algorithm decisions, such as whether to roll back to a previous system version?
It is acceptable to use the test set to decide whether to roll back your ML system to a previous version, provided you do so infrequently.
According to Andrew Ng, if you use the test set to make algorithm decisions, you will start to _____ to the test set, rendering it unable to give a completely unbiased estimate of system performance.
Which use of the test set is acceptable when tracking a team's ML progress?
Using the test set to decide whether to roll back to a previous system will cause overfitting to the test set.
When algorithm decisions are based on test set results, the team will start to _____ to the test set.
Match each test set usage to its correct classification as acceptable or problematic.
Order the steps showing how repeated algorithm decisions based on test set scores lead to an unreliable performance estimate.
Why does Machine Learning Yearning emphasize keeping the test set free from algorithm decisions?
Evaluating your system on the test set once per week to track progress is an acceptable practice.
After using the test set for algorithm decisions, it can no longer give a completely _____ estimate of system performance.
Match each concept to its definition as used in the context of test set integrity.
Order the recommended steps for correctly using the test set while maintaining its integrity during ML development.
Explain the consequences of using test set evaluation scores to make algorithmic rollback decisions.
Diagnose the methodological error in a team's test set evaluation process.
What is the risk of using test set performance to guide algorithm rollback decisions?

Learn Before

Related

Learn After