Learn Before
Evaluating an LLM Training Strategy
Based on the principles of configuring large model training, evaluate the engineer's conclusion and their recommended course of action. Justify your evaluation.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A machine learning team is training a new 10-billion-parameter language model on a novel, specialized dataset. They meticulously copy the exact training configuration (optimizer, learning rate schedule, parallelism strategy) from a famous research paper that successfully trained a model of a similar size. After several days, their training run becomes unstable and the model's performance collapses. What is the most probable explanation for this failure?
Evaluating an LLM Training Strategy
A research lab has a fixed computational budget to train a new large language model for a specific scientific domain. They have developed a promising initial configuration but are uncertain if it is optimal. Which of the following strategies represents the most effective and prudent use of their budget, given the complexities of establishing a stable and efficient training process?