Prioritizing Pre-training Efforts
A machine learning team is developing a new language model and is debating how to allocate their resources. They can either spend the next six months designing a novel, more sophisticated pre-training objective, or they can use their existing, simpler objective and focus all their efforts on acquiring a much larger dataset and securing more computational power for training. Based on the general principle demonstrated by the evolution of large-scale pre-trained models, which strategy is more likely to yield a more performant model? Justify your reasoning.
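The principle at stake is the scaling-law behavior reported for large pre-trained models: holding a fixed, simple objective, loss falls smoothly and predictably as parameters, data, and compute grow. Below is a minimal sketch of that arithmetic, borrowing the parametric loss and fitted constants published for Chinchilla (Hoffmann et al., 2022) along with the rough C ≈ 6ND and D ≈ 20N heuristics; all of these values are illustrative fits from that paper, not predictions for any particular model.

```python
import math

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Parametric pre-training loss L(N, D) = E + A/N^alpha + B/D^beta,
    using the constants fitted in Hoffmann et al. (2022)."""
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

for C in (1e21, 1e22, 1e23):
    # Compute-optimal rule of thumb: D ~ 20 * N tokens, with C ~ 6 * N * D FLOPs,
    # so N = sqrt(C / 120).
    N = math.sqrt(C / 120)
    D = 20 * N
    print(f"C={C:.0e} FLOPs: N={N/1e9:.1f}B params, D={D/1e9:.0f}B tokens, "
          f"loss ~ {chinchilla_loss(N, D):.3f}")
```

Each order of magnitude of compute buys a predictable loss reduction under the same unchanged objective; this smooth, reliable improvement from scale is the empirical backing for the data-and-compute-first strategy.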
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating a Language Model Pre-training Strategy
A research team is planning to pre-train a new language model with a fixed computational budget. One senior researcher argues, 'Instead of just using more data with our current simple masked language modeling objective, we should dedicate a significant portion of our budget to developing and implementing a novel, more complex pre-training task. This complexity is the key to unlocking better performance.' Based on the major findings from large-scale model training, which of the following statements provides the most accurate evaluation of this researcher's argument?
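For context on how simple the objective under debate actually is, here is a minimal PyTorch sketch of BERT-style masked language modeling; the function name is ours, and the 15% / 80-10-10 masking ratios follow Devlin et al. (2018).

```python
import torch

def mask_for_mlm(input_ids, mask_token_id, vocab_size, mlm_prob=0.15):
    """BERT-style masking: select ~15% of positions as prediction targets;
    of those, 80% become [MASK], 10% a random token, 10% stay unchanged."""
    labels = input_ids.clone()
    targets = torch.bernoulli(torch.full(input_ids.shape, mlm_prob)).bool()
    labels[~targets] = -100  # default ignore_index for cross-entropy

    corrupted = input_ids.clone()
    to_mask = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & targets
    corrupted[to_mask] = mask_token_id

    to_random = (torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool()
                 & targets & ~to_mask)
    corrupted[to_random] = torch.randint(vocab_size, input_ids.shape)[to_random]
    return corrupted, labels  # feed corrupted tokens in, score logits vs. labels
```

The training loss is then just `torch.nn.functional.cross_entropy(logits.view(-1, vocab_size), labels.view(-1))` over whatever encoder produces the logits, which underscores the researcher's framing: the objective itself contributes little machinery compared to the scale of data and compute behind it.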