Learn Before
Evaluating a Predictive Model for LLM Training
As an expert on the training dynamics of large models, explain to the project manager why their power-law-based predictive model failed to anticipate this behavior. What fundamental characteristic of their chosen modeling function is at the root of this discrepancy?
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Fitting LLM Learning Curves with Diverse Functions
A research team is modeling the performance of a large language model as they increase the amount of training data. Their predictive model, based on a standard power-law function, anticipates a steady, continuous improvement in performance. However, their experiments show that the model's error rate first decreases, then temporarily increases, before decreasing again. Which statement best analyzes the limitation of their predictive model in this context?
Evaluating a Predictive Model for LLM Training
Predicting Complex Learning Dynamics
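The related question above describes an error rate that decreases, temporarily increases, and then decreases again, while a standard power law predicts steady improvement. The following is a minimal sketch of why the two cannot agree, under stated assumptions: the data points are hypothetical values invented for illustration, the fitted form err ≈ a · n^(−b) is one common power-law parameterization (not necessarily the one the team used), and the helper names `fit_power_law` and `power_law` are made up for this example.

```python
import numpy as np

# Hypothetical training-set sizes and error rates (illustrative values only):
# the error falls, temporarily rises, then falls again.
n_samples = np.array([1e3, 2e3, 4e3, 8e3, 1.6e4, 3.2e4, 6.4e4, 1.28e5])
observed_error = np.array([0.50, 0.40, 0.33, 0.36, 0.41, 0.34, 0.27, 0.22])


def fit_power_law(n, err):
    """Least-squares fit of err ~ a * n**(-b), done as a line in log-log space."""
    slope, intercept = np.polyfit(np.log(n), np.log(err), deg=1)
    return np.exp(intercept), -slope


def power_law(n, a, b):
    """Evaluate the assumed power-law form a * n**(-b)."""
    return a * n ** (-b)


a, b = fit_power_law(n_samples, observed_error)
predicted_error = power_law(n_samples, a, b)

# A curve "rises" if any consecutive difference is positive.
observed_rises = bool(np.any(np.diff(observed_error) > 0))
predicted_rises = bool(np.any(np.diff(predicted_error) > 0))

print(f"fitted a={a:.3f}, b={b:.3f}")
print("observed curve ever rises:", observed_rises)
print("power-law fit ever rises:", predicted_rises)
```

With a > 0 and b > 0, the fitted function is strictly decreasing in n, so no choice of parameters can reproduce the temporary increase; the fit can only average through it. Capturing that bump would require a more flexible functional form than a single power-law term.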