1Cademy - Unit Reward Function for Segments

Learn Before

Segment-Based Rating Loss Function

Formula

Unit Reward Function for Segments

A simplified reward function assigns every segment a constant reward of 1. Formally, $r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}) = 1$. Under this model, the reward does not depend on the prompt $\mathbf{x}$, the complete response $\mathbf{y}$, or the content of the segment $\bar{\mathbf{y}}$.
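As a minimal sketch of this definition, the Python snippet below writes the unit reward as a function that accepts all three arguments and ignores them. The names `unit_reward` and `segment_rewards` are hypothetical, as is the choice to represent the prompt, response, and segments as plain strings; none of these come from the source.

```python
from typing import List, Sequence


def unit_reward(x: str, y: str, y_bar: str) -> float:
    """Unit reward r(x, y, y_bar) = 1.

    x is the prompt, y the complete response, y_bar one segment of y.
    All three are accepted to match the general signature but are unused.
    """
    return 1.0


def segment_rewards(x: str, y: str, segments: Sequence[str]) -> List[float]:
    """Apply the unit reward to each segment (hypothetical helper)."""
    return [unit_reward(x, y, y_bar) for y_bar in segments]


if __name__ == "__main__":
    prompt = "Explain gravity."
    response = "Gravity attracts masses. It keeps planets in orbit."
    # Assumed segmentation for illustration only: split on sentence boundaries.
    segments = ["Gravity attracts masses.", "It keeps planets in orbit."]
    print(segment_rewards(prompt, response, segments))
```

Because every segment receives the same value, the list of rewards carries no information beyond the number of segments, whatever the prompt or response contains.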

Updated 2026-06-29

References

Reference of Foundations of Large Language Models Course
