Learn Before
Strategy for Model Improvement
A research lab has a language model with a well-established architecture. The model already performs well, and the lab has a budget for one final push toward significant improvement. Two competing strategies are under debate. Evaluate the two proposals below and determine which is more likely to yield a substantial performance boost, justifying your choice with key findings from large-scale model training experiments.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Strategy for Model Improvement
A machine learning team has a well-performing language model and a fixed budget for one final improvement phase. They can either use the budget to engineer a new, complex architectural component or use it to triple the size of their training dataset and extend the training time. Based on the principles demonstrated by studies on scaling language models, which of the following is the most likely outcome?
Key studies on scaling pre-trained language models (e.g., Kaplan et al., 2020; Hoffmann et al., 2022) have concluded that increasing the amount of training data and computation is the primary driver of performance improvements, while further architectural innovation within an already well-established design generally yields smaller, diminishing gains.
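The effect of tripling the training dataset can be sketched with a data-scaling power law of the form L(D) = (D_c / D)^alpha_D. The constants below are the approximate fits reported by Kaplan et al. (2020) for Transformer language models and are assumptions for illustration, not measurements from the scenario above:

```python
# Illustrative data-scaling power law: predicted loss as a function of
# dataset size D, holding model size and other factors fixed.
# alpha_d and d_c are assumed constants (approximate Kaplan et al. 2020 fits).

def loss_from_data(tokens: float, alpha_d: float = 0.095, d_c: float = 5.4e13) -> float:
    """Predicted cross-entropy loss (nats/token) for a dataset of `tokens` tokens."""
    return (d_c / tokens) ** alpha_d

before = loss_from_data(1e11)   # hypothetical 100B-token training set
after = loss_from_data(3e11)    # the same set, tripled in size

print(f"loss before: {before:.3f}")
print(f"loss after:  {after:.3f}")
print(f"relative improvement: {1 - after / before:.1%}")
```

Under this power law the relative improvement from tripling data is 1 - 3^(-alpha_d), independent of the starting dataset size, so the payoff from more data is predictable in a way that a speculative architectural change is not.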