Short Answer

Predicting Performance Improvement from Model Scaling

A research lab's language models follow the power-law relationship $L(N) \propto N^{-0.076}$, where $L$ is the test loss and $N$ is the number of parameters. If they increase the number of parameters in their next model by a factor of 100, what is the expected percentage decrease in the test loss? Show your calculation.
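One way to check the arithmetic is a minimal Python sketch, assuming the power law holds exactly over the whole scaling range and the proportionality constant is unchanged between the two models. Under that assumption the constant cancels, so the loss ratio depends only on the scale factor: L(kN) / L(N) = k^(-0.076). The function names `loss_ratio` and `percent_decrease` are hypothetical helpers introduced here for illustration.

```python
def loss_ratio(scale_factor: float, exponent: float = 0.076) -> float:
    """Return L(k*N) / L(N) for a loss of the form L(N) proportional to N**(-exponent).

    The proportionality constant cancels in the ratio, so only the
    scale factor k and the exponent matter.
    """
    return scale_factor ** (-exponent)


def percent_decrease(scale_factor: float, exponent: float = 0.076) -> float:
    """Return the percentage by which the loss drops when N grows by scale_factor."""
    return (1.0 - loss_ratio(scale_factor, exponent)) * 100.0


if __name__ == "__main__":
    # 100x more parameters: ratio = 100**(-0.076) = 10**(-0.152), about 0.705
    print(f"loss ratio: {loss_ratio(100):.3f}")
    print(f"percentage decrease: {percent_decrease(100):.1f}%")  # prints 29.5%
```

With a scale factor of 100, the new loss is roughly 70.5% of the old one, so the expected decrease is about 29.5%.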

Updated 2025-10-10

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science