Pointwise Loss Function for Reward Model Training
Training a pointwise reward model involves minimizing a loss function that measures the discrepancy between the model's predicted reward, predicted_reward, and the actual score provided by human annotators, human_score. This frames the problem as a regression task, and the loss is typically mean squared error (MSE) or another standard regression loss. Using MSE, the loss is formulated as L = E[(human_score - predicted_reward)^2]. By minimizing this loss, the model learns to produce rewards that closely match the absolute scores assigned by humans.
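The regression objective above can be sketched as a short function; a minimal sketch, where the names `human_scores` and `predicted_rewards` are illustrative placeholders for annotator labels and model outputs:

```python
def mse_loss(human_scores, predicted_rewards):
    """Mean squared error between annotator scores and predicted rewards.

    Approximates L = E[(human_score - predicted_reward)^2] as the
    sample average over a batch of annotated responses.
    """
    n = len(human_scores)
    return sum((s - r) ** 2 for s, r in zip(human_scores, predicted_rewards)) / n

# Example: three responses scored by annotators vs. the model's predictions
human_scores = [7.0, 3.0, 9.0]
predicted_rewards = [6.5, 4.0, 8.0]
print(mse_loss(human_scores, predicted_rewards))  # 0.75
```

In practice the same quantity would be computed over mini-batches inside a training loop, with gradients taken with respect to the reward model's parameters.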
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Limitations of the Pointwise Method in RLHF
Comparison of Pointwise vs. Relative Preference Methods in RLHF
Suitable Applications for the Pointwise Method in RLHF
Negative Mean Squared Error Objective for Pointwise Reward Models
Conceptual Advantages of Pointwise Methods in RLHF
A research team is developing a reward model to score the quality of AI-generated poetry. Their team of human labelers consists of literary experts from diverse cultural backgrounds, leading to highly subjective and varied opinions on what constitutes 'good' poetry. Given this context, which of the following methods for collecting human feedback would likely introduce the most noise and inconsistency into the reward model's training data?
A team is training a reward model for a language model. They collect human feedback by presenting annotators with a single, model-generated response to a prompt and asking them to assign a quality score on a scale of 1 to 10. How does this data collection approach frame the learning task for the reward model?
Choosing a Feedback Collection Method
Learn After
Calculating Pointwise Reward Model Loss
A machine learning engineer is training a reward model where human annotators assign an absolute quality score to each generated text. The engineer considers switching the loss function from Mean Squared Error (MSE), which calculates (human_score - predicted_reward)^2, to Mean Absolute Error (MAE), which calculates |human_score - predicted_reward|. What is the most significant consequence of this change on the reward model's learned behavior?
True or False: When training a reward model using the loss function L = E[(human_score - predicted_reward)^2], the primary objective is to ensure that for any two outputs, the one with the higher human score also receives a higher predicted reward from the model.