Definition

Weak-to-Strong Performance (Pweak→strong)

Weak-to-Strong Performance, denoted Pweak→strong, is the performance of a strong model on a test set after the strong model has been fine-tuned using supervision from a weaker model. It is used to evaluate how effective the weak-to-strong fine-tuning process is.
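As a rough illustration, the metric can be computed by scoring the weakly supervised strong model on a held-out test set. The sketch below is a minimal example under stated assumptions: it uses accuracy against ground-truth test labels as the performance measure (the definition above does not fix a particular measure), and the names `accuracy`, `weak_to_strong_performance`, and `toy_strong_model` are hypothetical stand-ins, not part of any library.

```python
from typing import Callable, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot score an empty test set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def weak_to_strong_performance(
    strong_model: Callable[[int], int],
    test_inputs: Sequence[int],
    test_labels: Sequence[int],
) -> float:
    """Score a strong model that was already fine-tuned on weak supervision.

    Assumption: performance is measured as accuracy on the test set.
    """
    predictions = [strong_model(x) for x in test_inputs]
    return accuracy(predictions, test_labels)


# Hypothetical stand-in for a strong model after fine-tuning on weak labels.
# It has picked up a slightly wrong decision threshold (3 instead of 4),
# the kind of error that imperfect weak supervision could introduce.
def toy_strong_model(x: int) -> int:
    return 1 if x >= 3 else 0


# Toy test set: the true label is 1 exactly when x >= 4.
test_inputs = [0, 1, 2, 3, 4, 5, 6, 7]
test_labels = [0, 0, 0, 0, 1, 1, 1, 1]

p_weak_to_strong = weak_to_strong_performance(
    toy_strong_model, test_inputs, test_labels
)
print(f"P_weak_to_strong = {p_weak_to_strong:.3f}")
```

In this toy setup the model is wrong only on the input 3, so the score is 7 of 8 correct. In practice `toy_strong_model` would be replaced by the actual fine-tuned model's prediction function, and accuracy by whichever task metric is in use.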


Updated 2026-05-01


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences