1Cademy - Role of Context in Segment-Based Reward

Learn Before

Segment-Based Reward Score Formula

Short Answer

When computing a reward score for a single segment of a generated text, the formula is often written as $r^k = r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}_k)$. Explain why the complete generated output, $\mathbf{y}$, is included as an input to the reward model $r$ when the goal is to score only a specific segment, $\bar{\mathbf{y}}_k$.
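
To make the three-argument signature concrete, here is a minimal sketch in Python. The function `toy_segment_reward` and its word-overlap rule are hypothetical stand-ins invented for illustration, not an actual reward model. It assumes the segment appears verbatim in the full output and penalizes a segment that repeats words already present earlier in that output, a property that cannot be judged from the segment alone.

```python
import re


def toy_segment_reward(x: str, y: str, segment: str) -> float:
    """Hypothetical stand-in for r(x, y, y_k).

    Returns 1.0 when the segment adds only new words and -1.0 when every
    word in it already appeared earlier in the full output y. The prompt x
    is accepted to mirror the formula but is unused by this toy rule.
    """
    if not segment:
        raise ValueError("segment must be non-empty")
    start = y.find(segment)  # first occurrence of the segment inside y
    if start == -1:
        raise ValueError("segment must be a span of y")

    preceding_words = set(re.findall(r"\w+", y[:start].lower()))
    segment_words = set(re.findall(r"\w+", segment.lower()))
    if not segment_words:
        raise ValueError("segment must contain at least one word")

    # Fraction of the segment's words that were already said earlier in y.
    overlap = len(segment_words & preceding_words) / len(segment_words)
    return 1.0 - 2.0 * overlap


x = "Name the capital of France."
segment = "Paris is the capital."

y_a = "Paris is the capital."                         # segment is the whole output
y_b = "The capital is Paris. Paris is the capital."   # segment repeats earlier text

score_a = toy_segment_reward(x, y_a, segment)
score_b = toy_segment_reward(x, y_b, segment)
print(score_a, score_b)
```

The same segment string receives different scores under `y_a` and `y_b`, which is only possible because the scorer can see the full output in addition to the segment.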

Updated 2025-10-08
