
Chinchilla Scaling Law Formula

Hoffmann et al. (2022) fitted an empirical equation for the Chinchilla scaling law that estimates the test loss ($\mathcal{L}$) from the model size ($N$, the number of parameters) and the dataset size ($D$, the number of training tokens). The formulation is expressed as:

$$\mathcal{L}(N,D) = \underbrace{\frac{406.4}{N^{0.34}}}_{\text{model scaling}} + \underbrace{\frac{410.7}{D^{0.28}}}_{\text{dataset scaling}} + \underbrace{1.69}_{\text{irreducible error}}$$

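This relationship divides the overall loss into three distinct components: a model scaling term, a dataset scaling term, and a baseline irreducible error of $1.69$. The first two terms shrink toward zero as $N$ and $D$ grow, so the predicted loss approaches $1.69$ from above but never falls below it.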
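To make the decomposition concrete, the formula can be evaluated directly. The following is a minimal sketch that uses only the constants quoted above; the function names `chinchilla_loss` and `loss_components` and the example values of $N$ and $D$ are illustrative assumptions, not part of the original paper's code.

```python
# Fitted constants as quoted in the formula above (Hoffmann et al., 2022).
A, ALPHA = 406.4, 0.34  # model scaling term: A / N**ALPHA
B, BETA = 410.7, 0.28   # dataset scaling term: B / D**BETA
E = 1.69                # irreducible error


def loss_components(n_params: float, n_tokens: float) -> dict:
    """Return the three additive terms of the predicted loss.

    n_params (N) and n_tokens (D) are hypothetical argument names for
    the model size and dataset size used in the formula.
    """
    if n_params <= 0 or n_tokens <= 0:
        raise ValueError("N and D must be positive")
    return {
        "model_scaling": A / n_params**ALPHA,
        "dataset_scaling": B / n_tokens**BETA,
        "irreducible_error": E,
    }


def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted test loss L(N, D) as the sum of the three terms."""
    return sum(loss_components(n_params, n_tokens).values())


if __name__ == "__main__":
    # Illustrative setting: 70 billion parameters, 1.4 trillion tokens.
    n, d = 70e9, 1.4e12
    for name, value in loss_components(n, d).items():
        print(f"{name:>18}: {value:.4f}")
    print(f"{'total loss':>18}: {chinchilla_loss(n, d):.4f}")
```

For this illustrative setting the predicted loss comes out at about 1.94, with the dataset scaling term contributing roughly twice as much as the model scaling term. Increasing either `n_params` or `n_tokens` lowers the prediction, but only toward the 1.69 floor.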


Updated 2026-04-22


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences