
Test Loss Scaling with Dataset Size

A key finding of language model scaling laws is that a model's final test loss falls predictably as the training dataset grows. As the dataset size (D) increases, the test loss (L) decreases according to a power law, L(D) ∝ D^(-α_D). On a log-log plot this relationship appears as a nearly straight line, indicating a predictable improvement in model performance with more data.


Updated 2026-05-02


Tags: Ch.2 Generative Models - Foundations of Large Language Models; Computing Sciences