1Cademy - Diagnosing Classifier Ranking Mismatch in Spam Detection

Learn Before

Changing Dev/Test Sets or the Metric When They No Longer Guide the Team

Case Study

Case context: A development team is building a spam classifier. Their development set and accuracy metric rank Classifier A higher than Classifier B. However, when the team manually reviews the results, it finds that Classifier B is actually the better choice for the product, because Classifier A lets too many highly offensive emails through. The team suspects that its evaluation setup (dev set plus metric) no longer reflects what the product needs.

Question: Given this scenario, identify the warning sign that occurred, name the most likely cause of the issue among the three outlined by Andrew Ng, and state what change the team must make and what action they must take immediately after.

Sample answer: The warning sign is that the dev set and metric rank Classifier A higher, while the team judges Classifier B to be superior for the product. The most likely cause is that the evaluation metric is measuring something other than what the project needs to optimize: accuracy counts every misclassification equally, so it does not capture the product's need to block offensive emails. The team must quickly change the evaluation metric to align with its true objectives, and then make sure the entire team knows about the new direction.
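To make the mismatch concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than part of the case study: the 20-email toy dev set, the two prediction lists, the `Email` type, and the `weighted_error` function with its weight of 10 for offensive spam. It shows one possible replacement metric, not the metric the team is required to adopt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Email:
    is_spam: bool
    is_offensive: bool  # hypothetical flag; only set for spam in this toy data


def accuracy(emails, predictions):
    """Fraction of emails classified correctly; every mistake counts the same."""
    correct = sum(pred == e.is_spam for e, pred in zip(emails, predictions))
    return correct / len(emails)


def weighted_error(emails, predictions, offensive_weight=10.0):
    """Weighted error rate: mistakes on offensive emails cost more.

    offensive_weight is an assumed value chosen for illustration.
    """
    weights = [offensive_weight if e.is_offensive else 1.0 for e in emails]
    missed = sum(
        w for e, pred, w in zip(emails, predictions, weights) if pred != e.is_spam
    )
    return missed / sum(weights)


# Toy dev set: 10 legitimate emails, 6 ordinary spam, 4 offensive spam.
emails = (
    [Email(False, False)] * 10
    + [Email(True, False)] * 6
    + [Email(True, True)] * 4
)

# Classifier A: perfect on legitimate mail and ordinary spam,
# but lets 3 of the 4 offensive emails through.
preds_a = [False] * 10 + [True] * 6 + [True, False, False, False]

# Classifier B: flags 2 legitimate emails and misses 2 ordinary spam,
# but blocks every offensive email.
preds_b = [True, True] + [False] * 8 + [False, False] + [True] * 4 + [True] * 4

for name, preds in (("A", preds_a), ("B", preds_b)):
    print(
        f"Classifier {name}: accuracy={accuracy(emails, preds):.2f}, "
        f"weighted error={weighted_error(emails, preds):.3f}"
    )
```

On this toy data, accuracy prefers Classifier A (17 of 20 correct versus 16 of 20), while the weighted error prefers Classifier B, which matches the team's manual judgment. The choice of weight is a product decision; the point is only that the metric should encode what the team actually cares about.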

Key points:

Identify the discrepancy in classifier ranking as the key warning sign.
Diagnose that the metric is no longer measuring what is most important to the project.
Recommend changing the metric quickly.
Ensure the team is informed of the new direction.

Rubric: The response must identify the classifier ranking discrepancy as the warning sign, specify that the metric optimizes the wrong objective as the cause, recommend changing the metric, and state the necessity of informing the team about the new direction.

Updated 2026-06-17
