1Cademy - A machine learning engineer is pre-training a large language model. They monitor the models performance on a separate, unseen dataset after every 10,000 training steps. They observe the following trend: - Steps 1-100,000: Performance steadily improves. - Step 110,000: The model achieves its best performance so far. - Steps 120,000-150,000: Performance consistently worsens with each measurement. Based on this observation, what is the most appropriate immediate action to ensure the best possible

Learn Before

Performance Degradation and Early Stopping in Pre-training

Multiple Choice

A machine learning engineer is pre-training a large language model. They monitor the model's performance on a separate, unseen dataset after every 10,000 training steps. They observe the following trend:

Steps 1-100,000: Performance steadily improves.
Step 110,000: The model achieves its best performance so far.
Steps 120,000-150,000: Performance consistently worsens with each measurement.

Based on this observation, what is the most appropriate immediate action to ensure the best possible

Updated 2025-09-26

Contributors are:

Who are from:

Learn Before

Related