Learn Before
Establishing a Performance Baseline
In the context of the following scenario, identify the 'Weak Performance' (Pweak) value and explain its primary purpose in the experiment.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Performance Gap Recovered (PGR)
Establishing a Performance Baseline
A research team is developing a powerful language model (a 'strong model') for a complex task. To guide its training, they first use a smaller, less capable model (a 'weak model'). They evaluate this weak model on a dedicated test set, where it achieves an accuracy of 72%. After the strong model is supervised by the weak model, the strong model achieves an accuracy of 85% on the same test set. In this scenario, what value represents the weak performance baseline (Pweak) used to measure the overall improvement?
The Role of a Baseline in Model Evaluation
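
The arithmetic behind the weak-to-strong scenario above can be sketched in Python. The 72% and 85% figures come from the scenario. The strong-model ceiling of 92% is purely a hypothetical value, introduced here because the scenario does not state one, and the function and variable names are illustrative only. The sketch assumes the usual definition of Performance Gap Recovered: the gain over the weak baseline divided by the gap between the weak baseline and the strong ceiling.

```python
def performance_gap_recovered(p_weak: float,
                              p_weak_to_strong: float,
                              p_strong_ceiling: float) -> float:
    """Fraction of the weak-to-ceiling gap recovered under weak supervision."""
    gap = p_strong_ceiling - p_weak
    if gap <= 0:
        raise ValueError("The strong ceiling must exceed the weak baseline.")
    return (p_weak_to_strong - p_weak) / gap


# Values taken from the scenario.
P_WEAK = 0.72            # weak model accuracy: the baseline (Pweak)
P_WEAK_TO_STRONG = 0.85  # strong model accuracy after weak supervision

# ASSUMPTION: the scenario gives no ceiling; 0.92 is a made-up value.
ASSUMED_STRONG_CEILING = 0.92

# Improvement measured relative to the weak baseline.
absolute_gain = P_WEAK_TO_STRONG - P_WEAK
pgr = performance_gap_recovered(P_WEAK, P_WEAK_TO_STRONG, ASSUMED_STRONG_CEILING)

print(f"Gain over weak baseline: {absolute_gain:.2f}")
print(f"PGR under the assumed ceiling: {pgr:.2f}")
```

Pweak is the reference point in both quantities: without it, the 85% result cannot be read as an improvement, and the PGR ratio has neither a numerator nor a denominator.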