Learn Before
Key studies on scaling pre-trained language models (e.g., Kaplan et al., 2020; Hoffmann et al., 2022) have concluded that increasing model size, training data, and compute is the primary driver of performance improvements, with loss falling smoothly and predictably as these quantities grow, while fundamental architectural innovations within the Transformer family are generally less impactful.
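Scaling-law studies model pre-training loss as a power law in compute (or data, or parameters), which is why gains from scale are smooth and predictable. A minimal sketch of this relationship, with hypothetical constants chosen only for illustration (the form `L(C) = (C_c / C) ** alpha` follows Kaplan et al. 2020, but these exact values are not from any paper):

```python
def predicted_loss(compute, c_c=2.3e8, alpha=0.05):
    """Hypothetical power-law fit: predicted loss as a function of
    training compute. c_c and alpha are illustrative constants."""
    return (c_c / compute) ** alpha

# Each doubling of compute lowers predicted loss by the same fixed
# ratio (2 ** -alpha), regardless of the starting point -- no
# architectural change required.
for c in [1e18, 2e18, 4e18]:
    print(f"compute={c:.0e}  predicted loss={predicted_loss(c):.4f}")
```

The key property is scale invariance of the improvement: the ratio of losses at 2C and C is always `2 ** -alpha`, which is what makes returns from scaling predictable rather than diminishing to zero.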
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Strategy for Model Improvement
A machine learning team has a well-performing language model and a fixed budget for one final improvement phase. They can either use the budget to engineer a new, complex architectural component or use it to triple the size of their training dataset and extend the training time. Based on the principles demonstrated by studies on scaling language models, which of the following is the most likely outcome?
Key studies on scaling pre-trained language models (e.g., Kaplan et al., 2020; Hoffmann et al., 2022) have concluded that increasing model size, training data, and compute is the primary driver of performance improvements, with loss falling smoothly and predictably as these quantities grow, while fundamental architectural innovations within the Transformer family are generally less impactful.