1Cademy - Analyze why expanding the training dataset fails to resolve high training-set error.

Learn Before

Adding Training Data Does Not Help Much When Training Error Is High

Essay

Analyze why expanding the training dataset fails to resolve high training-set error.

Question: According to the course principles, explain why adding more training data is an ineffective technique when a model's training error is significantly higher than the target error, and describe what the developer must focus on instead.

Sample answer: When training error is significantly higher than the target error, the primary issue is high avoidable bias: the model is not fitting even the data it has already seen. Adding more training data mainly addresses variance problems (where the model fits the training set but fails to generalize to new data); it does not improve the model's capacity to fit the existing training set, so it has no significant effect on bias. The developer must therefore first focus on improving the algorithm's performance on the training set before expecting any improvement on the dev or test sets.
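The reasoning in the sample answer can be sketched as a small diagnostic. This is an illustrative sketch only: the function name `diagnose`, the use of percentage errors, and the rule of comparing which gap is larger are assumptions made here, not something the course prescribes.

```python
def diagnose(training_error, dev_error, target_error):
    """Split error into avoidable bias and variance, then suggest a focus.

    All errors are percentages. The decision rule (act on whichever gap
    is larger) is a simplifying assumption for illustration.
    """
    # Avoidable bias: how far training error is from the target error.
    avoidable_bias = training_error - target_error
    # Variance: how much worse the model does on dev data than on training data.
    variance = dev_error - training_error

    if avoidable_bias > variance:
        focus = "reduce bias: improve performance on the training set"
    else:
        focus = "reduce variance: adding training data may help"
    return avoidable_bias, variance, focus


# Training error (15%) is far above the target (5%), so more data is not the fix.
print(diagnose(training_error=15.0, dev_error=16.0, target_error=5.0))
```

In the example call, the avoidable bias (10 points) dwarfs the variance (1 point), which is the situation the question describes.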

Key points:

Adding more training data helps with variance problems, not bias.
A high training error relative to target error indicates avoidable bias.
Adding training data has no significant effect on reducing bias.
Practitioners must first improve performance on the training set.
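To make these points concrete, here is a hedged toy experiment, not taken from the course: a straight line (a model with too little capacity) is fitted to noiseless quadratic data, and the training error is measured as the training set grows. The data, the helper names `fit_line` and `training_mse`, and the value of `TARGET_MSE` are all assumptions chosen for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def training_mse(n):
    """Training-set mean squared error of a line fitted to y = x^2 on n points."""
    # n evenly spaced points in (-1, 1); the true relationship is quadratic.
    xs = [-1 + (2 * i + 1) / n for i in range(n)]
    ys = [x * x for x in xs]
    slope, intercept = fit_line(xs, ys)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n


TARGET_MSE = 0.01  # hypothetical target error for this toy problem

for n in (10, 100, 1000, 10000):
    print(n, round(training_mse(n), 4), "target:", TARGET_MSE)
```

However many points are added, the training error stays well above the target, because a line cannot represent a parabola. Only a change to the model (for example, adding an x² feature) would bring the training error down.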

Rubric: Response should clearly state that adding more data helps with variance but not bias/training error. It must specify that the model has high avoidable bias, and that the developer must prioritize improving training set performance before dev/test set performance.

Updated 2026-06-12
