1Cademy - Aggregated Reward as the Sum of Segment-Based Rewards

Learn Before

Notation for the RLHF Reward Model

Formula

Aggregated Reward as the Sum of Segment-Based Rewards

The total reward for an input $\mathbf{x}$ and a generated sequence $\mathbf{y}$, denoted $r(\mathbf{x}, \mathbf{y})$, can be computed by summing the individual rewards of the $n$ segments that make up $\mathbf{y}$. This aggregation is defined by the formula: $r(\mathbf{x}, \mathbf{y}) = \sum_{k=1}^{n} r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k)$ Here, $r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k)$ is the reward assigned to the $k$-th segment, and $\bar{\mathbf{y}}_k$ denotes that segment; the bar marks a segment of $\mathbf{y}$, not an average. The segment-level reward can depend on the input, the entire output sequence, and the segment itself.
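
As a rough illustration of the summation above, the following Python sketch aggregates segment-level rewards into a single sequence-level reward. Everything in it beyond the formula is an assumption made for illustration: the names `aggregate_reward`, `split_into_segments`, and `toy_segment_reward` are hypothetical, the fixed-size segmentation is only one possible way to form segments, and the toy scoring rule stands in for a learned reward model.

```python
from typing import Callable, List, Sequence

Tokens = Sequence[str]
# A segment-level reward takes (x, y, y_bar_k) and returns a scalar.
SegmentReward = Callable[[Tokens, Tokens, Tokens], float]


def split_into_segments(y: Tokens, size: int) -> List[Tokens]:
    """Hypothetical segmentation: cut y into consecutive chunks of `size` tokens."""
    return [y[i:i + size] for i in range(0, len(y), size)]


def aggregate_reward(
    x: Tokens,
    y: Tokens,
    segments: Sequence[Tokens],
    segment_reward: SegmentReward,
) -> float:
    """Compute r(x, y) as the sum over k of r(x, y, y_bar_k)."""
    return sum(segment_reward(x, y, seg) for seg in segments)


def toy_segment_reward(x: Tokens, y: Tokens, segment: Tokens) -> float:
    """Toy stand-in for a learned model (an assumption, not a real reward).

    Counts the segment's tokens that also occur in the input x, normalised
    by the length of the full output y, so the score depends on x, y, and
    the segment, as in the formula.
    """
    vocabulary = set(x)
    overlap = sum(1 for tok in segment if tok in vocabulary)
    return overlap / len(y)


x = ["the", "cat"]
y = ["the", "cat", "sat", "down"]
segments = split_into_segments(y, size=2)  # [["the", "cat"], ["sat", "down"]]
total = aggregate_reward(x, y, segments, toy_segment_reward)
print(total)  # 0.5
```

Passing the segment-level reward in as a callable keeps the aggregation step independent of how segments are scored, so a trained reward model could replace the toy function without changing the sum.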

Updated 2025-10-08

References

Reference of Foundations of Large Language Models Course
