Multiple Choice

A research team is working to improve a large language model and is using the combined power law, L(N,D) = aN^b + cD^d + ε_∞, to guide their efforts (here the exponents b and d are negative, so each term shrinks as N or D grows, and ε_∞ is the irreducible loss). Their analysis shows that the term aN^b, which depends on the model's parameter count (N), is currently the largest contributor to the total loss, while the term cD^d, which depends on the dataset size (D), is comparatively small. To achieve the most significant reduction in loss with their limited resources, what should the team prioritize?
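The intuition behind the question can be checked numerically. The sketch below evaluates the combined power law with illustrative coefficients (loosely in the range reported for fitted scaling laws; the specific values a, b, c, d, ε_∞ here are assumptions, not the question's fitted parameters) in a regime where the N-term dominates, and compares the loss reduction from doubling N versus doubling D:

```python
def combined_loss(N, D, a=406.4, b=-0.34, c=410.7, d=-0.28, eps_inf=1.69):
    """Combined power law L(N, D) = a*N^b + c*D^d + eps_inf.

    b and d are negative, so the first two terms shrink as the
    parameter count N and dataset size D grow; eps_inf is the
    irreducible loss floor. Coefficients are illustrative only.
    """
    return a * N**b + c * D**d + eps_inf

# A regime where the model is small relative to the data,
# so the N-dependent term dominates the reducible loss.
N, D = 1e8, 1e12

n_term = 406.4 * N**-0.34   # contribution of the parameter-count term
d_term = 410.7 * D**-0.28   # contribution of the dataset-size term

# Loss reduction from doubling N vs. doubling D.
gain_from_N = combined_loss(N, D) - combined_loss(2 * N, D)
gain_from_D = combined_loss(N, D) - combined_loss(N, 2 * D)

print(f"N-term: {n_term:.3f}, D-term: {d_term:.3f}")
print(f"double N: -{gain_from_N:.3f} loss, double D: -{gain_from_D:.3f} loss")
```

Because the N-term is the larger of the two, the same multiplicative increase applied to N removes a bigger slice of the loss than applying it to D, which is why prioritizing model size (parameter count) is the better use of resources in this scenario.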


Updated 2025-09-26

