Short Answer

Troubleshooting a Model Training Process

A machine learning engineer is training a large language model on a text corpus. After many iterations, they observe that the model's total loss fluctuates but shows no consistent downward trend. Assuming the loss calculation itself is correct, which core component of the iterative optimization procedure is most likely misconfigured or malfunctioning, and why does it produce this behavior?
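The component the question points at is the gradient-based parameter update, most commonly a mis-set learning rate in the optimizer: if each step is too large, the update repeatedly overshoots the minimum and the loss bounces around instead of trending down. A minimal sketch of this effect, assuming an illustrative toy quadratic loss, vanilla SGD, and Gaussian gradient noise standing in for minibatch sampling (all hypothetical, not from the question):

```python
import random

def loss(w):
    """Toy convex objective standing in for the training loss."""
    return (w - 3.0) ** 2

def grad(w):
    """Exact gradient of the toy loss."""
    return 2.0 * (w - 3.0)

def train(lr, steps=200, seed=0):
    """Plain SGD on the toy loss; noise mimics minibatch gradient estimates."""
    random.seed(seed)
    w, history = 0.0, []
    for _ in range(steps):
        g = grad(w) + random.gauss(0.0, 1.0)  # noisy gradient estimate
        w -= lr * g                           # the parameter update under test
        history.append(loss(w))
    return history

good = train(lr=0.05)  # well-tuned step size: loss trends down to the noise floor
bad = train(lr=0.95)   # oversized step size: each update overshoots the minimum,
                       # so the loss fluctuates with no consistent downward trend
```

Plotting (or averaging the tail of) the two histories shows the contrast: the small learning rate settles near the minimum, while the large one oscillates indefinitely, matching the symptom described in the question.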


Updated 2025-10-10


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science