Multiple Choice

A research team is developing a powerful language model (a 'strong model') for a complex task. To guide its training, they first use a smaller, less capable model (a 'weak model'). They evaluate this weak model on a dedicated test set, where it achieves an accuracy of 72%. After the strong model is supervised by the weak model, the strong model achieves an accuracy of 85% on the same test set. In this scenario, what value represents the weak performance baseline (Pweak) used to measure the overall improvement?
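As a rough illustration of how the numbers in this scenario fit together, the sketch below assumes the improvement is measured with the performance gap recovered (PGR) ratio commonly used in weak-to-strong generalization work. The function name and the ceiling accuracy `p_ceiling` are hypothetical; the scenario gives no ceiling value, so the one used here is only a placeholder.

```python
def performance_gap_recovered(p_weak, p_weak_to_strong, p_ceiling):
    """Fraction of the gap between the weak baseline and the ceiling
    that the weakly supervised strong model closes."""
    gap = p_ceiling - p_weak
    if gap <= 0:
        raise ValueError("p_ceiling must exceed p_weak")
    return (p_weak_to_strong - p_weak) / gap


# Values taken from the scenario.
p_weak = 0.72            # weak model's accuracy on the test set (the baseline)
p_weak_to_strong = 0.85  # strong model's accuracy after weak supervision

# Assumption: the scenario does not state a ceiling; 0.92 is a placeholder.
p_ceiling = 0.92

absolute_gain = p_weak_to_strong - p_weak
pgr = performance_gap_recovered(p_weak, p_weak_to_strong, p_ceiling)

print(f"absolute gain over the weak baseline: {absolute_gain:.2f}")
print(f"PGR under the assumed ceiling: {pgr:.2f}")
```

The baseline is the weak model's own test accuracy, so every improvement figure is computed relative to it. A different assumed ceiling would change the PGR but not the absolute gain.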


Updated 2025-10-03

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science