Formula

Segment-Based Rating Loss Function

When segment-level rating scores are available, a reward model can be trained pointwise with a regression loss. The loss is written as the negative expected squared difference between a segment's target rating score and the reward model's predicted score for that segment:

$$\mathcal{L}_{\mathrm{rating}} = -\mathbb{E}_{\bar{\mathbf{y}}_k} \big[ s(\bar{\mathbf{y}}_k) - r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k) \big]^2$$

In this equation, $s(\bar{\mathbf{y}}_k)$ is the target rating score for segment $\bar{\mathbf{y}}_k$, and $r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k)$ is the reward predicted by the model for that segment given the prompt $\mathbf{x}$ and the full output $\mathbf{y}$. The squared difference is zero when the prediction matches the target, so with the leading minus sign the expression attains its maximum there. Training therefore drives the predicted rewards toward the target ratings, as in ordinary mean-squared-error regression.
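As a rough illustration of the regression objective described above, the following sketch estimates the expectation by averaging over a handful of segments. It assumes the target ratings and predicted rewards are already available as plain lists of floats; the function name `rating_loss` and the example numbers are hypothetical and not taken from the source. The sketch computes the positive mean squared error, which is the quantity a gradient-descent trainer would minimize. The formula above carries a leading minus sign, so it corresponds to the negation of this value.

```python
def rating_loss(target_scores, predicted_rewards):
    """Mean squared error between target segment ratings s(y_k)
    and predicted segment rewards r(x, y, y_k).

    Hypothetical helper: the average over segments stands in for
    the expectation over segments in the formula.
    """
    if len(target_scores) != len(predicted_rewards):
        raise ValueError("one predicted reward is needed per target score")
    if not target_scores:
        raise ValueError("at least one segment is required")
    squared_errors = [
        (s - r) ** 2 for s, r in zip(target_scores, predicted_rewards)
    ]
    return sum(squared_errors) / len(squared_errors)


# Made-up ratings for three segments of one output.
target_scores = [1.0, 0.0, 0.5]
predicted_rewards = [0.8, 0.1, 0.5]

loss = rating_loss(target_scores, predicted_rewards)
print(f"rating loss (MSE form): {loss:.4f}")
```

In a real training setup the predicted rewards would come from the reward model and the loss would be computed with an automatic-differentiation framework so that gradients can flow back into the model parameters.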

Updated 2026-05-03

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences