Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Mixed Precision Training
Optimizing a Large Model Training Pipeline
When training a large language model, why might a team employ techniques such as model compression or mixed precision training even when they are already using a large-scale distributed system?
Even when training is effectively parallelized across a large distributed system, teams still benefit from techniques such as mixed precision training and model compression. Lower-precision arithmetic (fp16/bf16) roughly halves the memory needed for activations and gradients and runs faster on modern accelerators; the reduced per-device footprint allows larger batch sizes, larger models, or fewer devices; and smaller tensors cut the communication volume that often bottlenecks distributed training. Parallelism spreads the work across machines, while these techniques shrink the work itself, so the two compound rather than substitute for one another.
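As a concrete illustration, below is a minimal sketch of one mixed precision training step in PyTorch, assuming a CUDA device is available; the model, data, and hyperparameters are toy stand-ins, not part of the original card. It uses the standard `torch.cuda.amp` utilities: `autocast` runs eligible ops in reduced precision, and `GradScaler` rescales the loss so small fp16 gradients do not underflow.

```python
import torch
from torch import nn
from torch.cuda.amp import autocast, GradScaler

# Toy model and random data stand in for a real LLM training step (assumption).
model = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = GradScaler()  # scales the loss so fp16 gradients do not underflow

for step in range(10):
    x = torch.randn(32, 512, device="cuda")
    target = torch.randn(32, 512, device="cuda")

    optimizer.zero_grad(set_to_none=True)
    with autocast():  # forward pass runs eligible ops in fp16/bf16
        loss = nn.functional.mse_loss(model(x), target)

    scaler.scale(loss).backward()  # backward pass on the scaled loss
    scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
    scaler.update()                # adjusts the scale factor for the next step
```

The same pattern drops into a distributed setup unchanged (e.g., wrapping the model in DistributedDataParallel), which is the point of the card: mixed precision is orthogonal to, and stacks with, parallelization.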