Multiple Choice

A machine learning engineer is pre-training a large language model. They monitor the model's performance on a separate, unseen dataset after every 10,000 training steps. They observe the following trend:

  • Steps 1-100,000: Performance steadily improves.
  • Step 110,000: The model achieves its best performance so far.
  • Steps 120,000-150,000: Performance consistently worsens with each measurement.

Based on this observation, what is the most appropriate immediate action to ensure the best possible model is obtained from this training run?
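
The monitoring pattern in this scenario can be sketched in code. The following is a minimal illustration, not a prescribed implementation: the function name `best_checkpoint`, the `patience` parameter, and the scores in `history` are all hypothetical, chosen only to mirror the trend described above (higher score is assumed to mean better performance).

```python
from typing import Iterable, Optional, Tuple


def best_checkpoint(
    evals: Iterable[Tuple[int, float]], patience: int = 3
) -> Tuple[Optional[int], Optional[int]]:
    """Track the best held-out score seen so far.

    Returns (best_step, stop_step). stop_step is the step at which
    `patience` consecutive evaluations failed to improve on the best
    score, or None if that never happened.
    """
    best_step: Optional[int] = None
    best_score = float("-inf")
    evals_without_improvement = 0

    for step, score in evals:
        if score > best_score:
            # New best: remember it (in practice, also save a checkpoint here).
            best_step, best_score = step, score
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                return best_step, step

    return best_step, None


# Hypothetical scores, one per 10,000 steps: rising through step 110,000,
# then falling through step 150,000.
scores = [0.10, 0.18, 0.25, 0.31, 0.36, 0.40, 0.43, 0.45,
          0.47, 0.48, 0.49, 0.47, 0.45, 0.42, 0.40]
history = [(10_000 * (i + 1), s) for i, s in enumerate(scores)]

best_step, stop_step = best_checkpoint(history, patience=3)
```

With these made-up numbers, `best_step` is 110,000 and `stop_step` is 140,000, the third consecutive evaluation that failed to beat the best score. The sketch assumes a checkpoint was saved at each evaluation, so the weights from the best step can still be recovered.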


Updated 2025-09-26

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science