Learn Before
If a team has a stable and effective training process for a 10-billion-parameter language model, they can expect the same process to work reliably without significant modifications when applied to a 100-billion-parameter model, provided they have proportionally increased the computational resources and dataset size.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Key Issues in Large-Scale LLM Training
Training Instability in Large-Scale LLMs
Enabling Role of Deep Learning Infrastructure in LLM Development
Evaluating a Training Strategy for a Large-Scale Model
A machine learning team has successfully trained a 1-billion-parameter language model. They now plan to train a new 100-billion-parameter model using a proportionally larger dataset. Based on common experience with scaling up, which of the following represents the most critical and often unexpected challenge they are likely to encounter with the larger model's training process?
Computing Resources and Costs for Scaling LLM Training