Learn Before
Evaluating a Compute Budgeting Strategy
A startup is developing a language model specialized for analyzing legal contracts. The CEO finds a well-documented scaling law from a large tech company that accurately predicted the performance of that company's general-purpose web-text model as a function of compute budget. The CEO proposes using this exact scaling law to forecast the budget the startup's legal model needs to reach a target performance level. Critically evaluate this proposal. What is the primary risk of this approach, and why?
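To make the proposal concrete, here is a minimal sketch of what forecasting a budget from a scaling law could look like. It assumes a power-law form, loss = E + A * C^(-alpha), which is a common choice but is not stated in the question. The function names and every coefficient below are hypothetical, chosen only for illustration and not taken from any published fit.

```python
def predicted_loss(compute, A, alpha, E):
    """Assumed power-law form: loss = E + A * compute**(-alpha)."""
    return E + A * compute ** (-alpha)


def compute_for_target(target_loss, A, alpha, E):
    """Invert the assumed power law to get the compute needed for a target loss."""
    if target_loss <= E:
        raise ValueError("target loss is at or below the irreducible loss E")
    return (A / (target_loss - E)) ** (1.0 / alpha)


# Hypothetical coefficients, for illustration only.
# web_text: the curve borrowed from the other company's web-text model.
# legal: a made-up curve for the legal domain with a slightly higher floor E.
web_text = dict(A=10.0, alpha=0.05, E=1.7)
legal = dict(A=10.0, alpha=0.05, E=1.8)

target = 2.5  # hypothetical target loss

borrowed = compute_for_target(target, **web_text)  # budget the borrowed law suggests
actual = compute_for_target(target, **legal)       # budget if the legal curve differs
ratio = actual / borrowed

print(f"budget from borrowed law: {borrowed:.3e}")
print(f"budget under legal curve: {actual:.3e}")
print(f"underestimate factor:     {ratio:.1f}x")
```

In this sketch the forecast depends entirely on the fitted coefficients, and those were measured on a different data distribution. With these made-up numbers, a small shift in the floor E changes the required budget by more than an order of magnitude.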
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Limitations of Monotonic Scaling Functions
Limitation of Test Loss in Predicting Downstream Performance
A research team develops a scaling function that accurately predicts their language model's performance on English text as they increase the model's parameter count. Confident in their findings, they use the same function to budget for a new, larger model intended for generating computer code. However, the final code-generation model performs significantly worse than the function predicted. Which statement best explains this outcome?
A research lab has developed a scaling function that accurately predicts the performance of their specific 10-billion parameter language model on a large corpus of web text. This function can therefore be considered a reliable predictor for the performance of any other 10-billion parameter language model trained on a different large corpus of web text.