Learn Before
Comparing Single-Variable Scaling Functions
A research team has developed two separate mathematical functions to model their language model's performance. Function A describes the model's final loss solely as a function of the training dataset size (while holding model size constant). Function B describes the model's final loss solely as a function of the number of model parameters (while holding dataset size constant). Explain why relying on only one of these functions could lead to a suboptimal training strategy for a new, larger model.
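One way to see the pitfall: in a joint scaling law of the Chinchilla form, L(N, D) = E + A/N^α + B/D^β, the variable held fixed sets an irreducible floor that each single-variable curve flattens toward. The sketch below uses constants loosely based on published fits, but they are illustrative assumptions, not a definitive model:

```python
# Illustrative joint scaling law (Chinchilla-style form):
#   L(N, D) = E + A / N**alpha + B / D**beta
# The constants below are assumed for illustration, not fitted here.
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28

def loss(n_params, n_tokens):
    """Joint loss as a function of parameter count N and token count D."""
    return E + A / n_params ** alpha + B / n_tokens ** beta

# "Function A": loss vs. dataset size at a FIXED (small) model size.
fixed_n = 1e8
for d in [1e9, 1e10, 1e11, 1e12]:
    print(f"N={fixed_n:.0e}, D={d:.0e} -> loss {loss(fixed_n, d):.3f}")

# The curve above flattens toward the floor E + A / fixed_n**alpha set by
# the frozen model size, so extrapolating it alone says nothing about how
# much a larger model would benefit from the same data budget:
print(f"floor at N={fixed_n:.0e}: {E + A / fixed_n ** alpha:.3f}")
print(f"N=1e10, D=1e12 -> loss {loss(1e10, 1e12):.3f}")
```

Under these assumptions, either single-variable curve hides the term contributed by the frozen variable, so neither one alone can guide the joint allocation of a compute budget between parameters and data.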
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Absence of a Universal Scaling Law
A research team is developing a new language model. They train several versions of the model, each with a different number of parameters, while keeping the training dataset size fixed. They then plot each version's final training loss against its parameter count. The resulting graph shows a consistent, downward-curving trend: as the number of parameters increases, the loss decreases, but each successive increase yields a smaller improvement. Based on this observation, what is the most accurate conclusion the team can draw?
Optimizing LLM Training Budget
Comparing Single-Variable Scaling Functions
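The diminishing-returns trend described in the related question above can be sketched with an assumed power law with an irreducible offset, L(N) = c + a·N^(-b). All constants here are hypothetical, chosen only to reproduce the qualitative shape of the plot:

```python
# Assumed single-variable scaling curve with an irreducible floor c:
#   L(N) = c + a * N**(-b)
# Constants are illustrative, not measured values.
c, a, b = 2.0, 120.0, 0.3

def loss(n_params):
    return c + a * n_params ** (-b)

prev = None
for n in [1e7, 1e8, 1e9, 1e10]:
    l = loss(n)
    note = "" if prev is None else f"  (improvement {prev - l:.3f})"
    print(f"N={n:.0e}: loss {l:.3f}{note}")
    prev = l
```

Each tenfold increase in parameters buys a smaller loss reduction, and the curve asymptotes to the floor c rather than zero; the fitted trend also holds only at the one dataset size tested, which is why it cannot be read as a universal scaling law.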