Formula

Improved Power Law Formula for LLM Loss

The mathematical formulation for the improved scaling law incorporates an irreducible error term, $\epsilon_{\infty}$, into the basic power law, yielding the equation $\mathcal{L}(x) = ax^b + \epsilon_{\infty}$. This equation is one of the most widely used forms for designing scaling laws in Large Language Models. In this expression, $\epsilon_{\infty}$ represents the irreducible error resulting from unknown variables, which persists even as the variable of interest approaches infinity ($x \to \infty$).
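To make the role of the irreducible term concrete, the following minimal sketch evaluates the formula at increasing values of x. The function name `scaling_loss` and the coefficient values are hypothetical, chosen only for illustration. The sketch also assumes a negative exponent b, so that the power-law term shrinks as x grows; the text above does not state the sign of b.

```python
def scaling_loss(x: float, a: float, b: float, eps_inf: float) -> float:
    """Evaluate L(x) = a * x**b + eps_inf for a single value of x."""
    return a * x ** b + eps_inf


# Hypothetical coefficients (assumptions for illustration, not fitted values).
# B is negative so that the reducible term a * x**b decays as x increases.
A, B, EPS_INF = 10.0, -0.5, 1.7

for x in (1e2, 1e4, 1e6, 1e8):
    loss = scaling_loss(x, A, B, EPS_INF)
    reducible = loss - EPS_INF  # the part that more scale can still remove
    print(f"x = {x:.0e}  L(x) = {loss:.4f}  reducible part = {reducible:.4f}")
```

With these assumed values, every 100-fold increase in x cuts the reducible part by a factor of 10, while the total loss never drops below `EPS_INF`. That floor is the behavior the $\epsilon_{\infty}$ term is meant to capture.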


Updated 2026-04-21


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences