Learn Before
  • Empirical Reward Model Loss Formula

Case Study

Comparing Reward Model Performance

Analyze the following scenario and determine which of the two reward models is performing better on its respective dataset. Justify your answer by referencing the components of the empirical loss calculation.
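To make the comparison concrete, here is a minimal Python sketch of the empirical loss $\mathcal{L}_r(\phi)$ defined in the related question below. It assumes the common Bradley-Terry preference model, $\Pr_{\phi}(\mathbf{y}_a \succ \mathbf{y}_b \mid \mathbf{x}) = \sigma\left(r_{\phi}(\mathbf{x},\mathbf{y}_a) - r_{\phi}(\mathbf{x},\mathbf{y}_b)\right)$, which the case study does not state explicitly, and the reward scores below are made-up illustrations, not values from the scenario:

```python
import math

def empirical_loss(pref_pairs):
    """Empirical reward-model loss: the mean negative log-probability
    that the preferred response beats the rejected one.

    pref_pairs: list of (score_preferred, score_rejected) reward scores.
    Assumes the Bradley-Terry model, where
    Pr(y_a > y_b | x) = sigmoid(r(x, y_a) - r(x, y_b)).
    """
    total = 0.0
    for r_a, r_b in pref_pairs:
        # Probability the preferred response wins: sigmoid of the score margin.
        prob_preferred = 1.0 / (1.0 + math.exp(-(r_a - r_b)))
        total += -math.log(prob_preferred)
    return total / len(pref_pairs)  # average over |D_r| pairs

# Hypothetical scores: model A separates preferred from rejected responses
# by a wide margin on its dataset; model B barely does on its own.
model_a_pairs = [(2.0, -1.0), (1.5, -0.5)]
model_b_pairs = [(0.3, 0.1), (0.2, 0.0)]

print(f"Model A loss: {empirical_loss(model_a_pairs):.4f}")  # ~0.0878
print(f"Model B loss: {empirical_loss(model_b_pairs):.4f}")  # ~0.5981
```

Because the loss is the average negative log-probability assigned to the observed preferences, the model with the lower loss (here, the hypothetical Model A) assigns higher probability to each observed preference on its dataset, which is exactly the comparison this case study asks you to justify.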


Updated 2025-10-08

Contributors:

Gemini AI (Google)

Tags
  • Ch.4 Alignment - Foundations of Large Language Models
  • Foundations of Large Language Models
  • Foundations of Large Language Models Course
  • Computing Sciences
  • Analysis in Bloom's Taxonomy
  • Cognitive Psychology
  • Psychology
  • Social Science
  • Empirical Science
  • Science

Related
  • Impact of Data Distribution on Reward Model Training

  • A researcher is training a reward model using a small preference dataset, $\mathcal{D}_r$, which contains exactly two preference pairs:

    1. For input $\mathbf{x}_1$, response $\mathbf{y}_{1a}$ is preferred over $\mathbf{y}_{1b}$.
    2. For input $\mathbf{x}_2$, response $\mathbf{y}_{2a}$ is preferred over $\mathbf{y}_{2b}$.

    Given the empirical loss formula
    $$\mathcal{L}_r(\phi) = -\frac{1}{|\mathcal{D}_r|} \sum_{(\mathbf{x},\mathbf{y}_a,\mathbf{y}_b)\in\mathcal{D}_r} \log \Pr_{\phi}(\mathbf{y}_a \succ \mathbf{y}_b \mid \mathbf{x}),$$
    which of the following expressions correctly represents the loss for this specific dataset? (See the worked expansion after this list.)

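Substituting the two pairs into the formula above with $|\mathcal{D}_r| = 2$ gives the following worked expansion; this is a direct substitution into the stated loss, shown here for reference rather than as one of the question's answer choices:

$$\mathcal{L}_r(\phi) = -\frac{1}{2}\left[\log \Pr_{\phi}(\mathbf{y}_{1a} \succ \mathbf{y}_{1b} \mid \mathbf{x}_1) + \log \Pr_{\phi}(\mathbf{y}_{2a} \succ \mathbf{y}_{2b} \mid \mathbf{x}_2)\right]$$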
