Performance Gap Recovered (PGR)
Performance Gap Recovered (PGR) is a metric for evaluating the effectiveness of weak-to-strong generalization. It quantifies how much of the gap between a weak model's baseline performance (Pweak) and a strong model's theoretical maximum performance (the ceiling, Pceiling) is closed when the strong model is trained under the weak model's supervision (yielding Pweak→strong).
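The metric is standardly computed as PGR = (Pweak→strong − Pweak) / (Pceiling − Pweak). A minimal sketch of this calculation (the function name is illustrative, and the 92% ceiling in the example is a made-up value for demonstration):

```python
def performance_gap_recovered(p_weak, p_weak_to_strong, p_ceiling):
    """Fraction of the weak-to-ceiling performance gap closed by
    weak-to-strong training.

    PGR = (p_weak_to_strong - p_weak) / (p_ceiling - p_weak)

    1.0 means the strong model fully recovered the gap;
    0.0 means it did no better than its weak supervisor.
    """
    gap = p_ceiling - p_weak
    if gap <= 0:
        raise ValueError("Ceiling must exceed the weak baseline for PGR to be defined.")
    return (p_weak_to_strong - p_weak) / gap

# Example: weak model scores 72%, weak-to-strong model 85%,
# assumed ceiling 92%:
# PGR = (0.85 - 0.72) / (0.92 - 0.72) ≈ 0.65
print(f"{performance_gap_recovered(0.72, 0.85, 0.92):.2f}")
```

A PGR near 1 indicates the strong model generalized well beyond its weak supervisor's labels; a PGR near 0 indicates it merely imitated them.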

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Successful Weak-to-Strong Generalization: GPT-4 with GPT-2 Supervision
Weak Performance (Pweak) as a Baseline Metric
Weak-to-Strong Performance (Pweak→strong)
Strong Ceiling Performance (Pceiling)
Performance Gap Recovered (PGR)
Data Selection and Filtering Using Weak Models
Cascading Inference
Weak-to-Strong Generalization via Fine-Tuning on Weak Model Data
AI System Optimization Strategy
An AI development team is building a system to answer a very high volume of customer support queries. They implement a two-step process: first, a small, fast model attempts to answer each query. If this model's confidence in its answer is low, the query is then passed to a much larger, more powerful, but slower model. What is the most significant strategic advantage of this architectural choice?
Direct Supervision via Knowledge Distillation Loss in Weak-to-Strong Generalization
When a large, powerful computational model is trained using labels generated exclusively by a smaller, less accurate model, the performance of the large model on new, unseen data is fundamentally limited and cannot exceed the accuracy of the smaller model that provided the training labels.
Using Small Models for Pre-training or Fine-Tuning
Combining Small and Large Models
Performance Gap Recovered (PGR)
Establishing a Performance Baseline
A research team is developing a powerful language model (a 'strong model') for a complex task. To guide its training, they first use a smaller, less capable model (a 'weak model'). They evaluate this weak model on a dedicated test set, where it achieves an accuracy of 72%. After the strong model is supervised by the weak model, the strong model achieves an accuracy of 85% on the same test set. In this scenario, what value represents the weak performance baseline (Pweak) used to measure the overall improvement?
The Role of a Baseline in Model Evaluation
Performance Gap Recovered (PGR)
A research team trains a large, powerful model by fine-tuning it on a dataset labeled by a smaller, less accurate model. After this training process, they evaluate the powerful model on a held-out test set and find its performance is 85%. This 85% figure represents the weak-to-strong performance (Pweak→strong). What is the most accurate interpretation of this result?
Measuring Weak-to-Strong Generalization
To measure the weak-to-strong performance (Pweak→strong) of a powerful model, a specific sequence of actions must be followed. Arrange the core steps below into the correct chronological order.
Performance Gap Recovered (PGR)
A research team wants to establish the upper-bound performance benchmark for their new, powerful language model on a specific test set designed for sentiment analysis. This benchmark should represent the model's maximum possible score on this particular set of data. Which of the following procedures correctly describes how they should determine this performance ceiling?
Establishing a Performance Benchmark
Interpreting a Performance Benchmark
Learn After
Formula for Performance Gap Recovered (PGR)
An AI research team conducts two separate experiments to improve a powerful model's performance by having it learn from a less powerful one. The results are as follows:
- Experiment A: The less powerful model scores 50% on a task. The powerful model, after learning from the less powerful one, scores 70%. The powerful model's maximum possible score on this task is 90%.
- Experiment B: The less powerful model scores 70% on a different task. The powerful model, after learning from the less powerful one, scores 78%. The powerful model's maximum possible score on this task is 80%.
Based on these results, which experiment demonstrates a more effective transfer of knowledge from the less powerful model to the more powerful one, in terms of closing the potential performance gap?
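The comparison above can be worked through with the PGR formula, PGR = (Pweak→strong − Pweak) / (Pceiling − Pweak); the helper function below is an illustrative sketch, not part of the original material:

```python
def pgr(p_weak, p_weak_to_strong, p_ceiling):
    """Performance Gap Recovered: fraction of the weak-to-ceiling gap closed."""
    return (p_weak_to_strong - p_weak) / (p_ceiling - p_weak)

# Experiment A: weak 50%, weak-to-strong 70%, ceiling 90%.
pgr_a = pgr(0.50, 0.70, 0.90)  # (0.20) / (0.40) = 0.5
# Experiment B: weak 70%, weak-to-strong 78%, ceiling 80%.
pgr_b = pgr(0.70, 0.78, 0.80)  # (0.08) / (0.10) = 0.8

# Experiment B closes a larger fraction of its potential gap,
# even though its absolute gain (8 points) is smaller than A's (20 points).
print(f"A: {pgr_a:.2f}, B: {pgr_b:.2f}")
```

This is exactly the situation PGR is designed for: normalizing by the available headroom rather than comparing raw accuracy gains.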
Evaluating Knowledge Transfer Effectiveness
Evaluating Performance Gains in Model Training
Interpretation and Empirical Results of Performance Gap Recovered