1Cademy - Formula for a Negative Reward Function $$r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}})$$

Learn Before

Average Value Notation ( $\bar{y}$ )
Examples of Constant Segment-Based Reward Functions

Formula

Formula for a Negative Reward Function $r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}})$

The function $r(\mathbf{x}, \mathbf{y}, \bar{\mathbf{y}}) = -1$ assigns a constant negative value. This function's arguments are the vectors $\mathbf{x}$ and $\mathbf{y}$ , and the average vector $\bar{\mathbf{y}}$ . In machine learning, a value of $-1$ often signifies a negative reward, a penalty, or indicates a negative sample.

Updated 2026-06-26

Contributors are:

Who are from:

References

Reference of Foundations of Large Language Models Course

Learn After

In a machine learning system, a penalty is assigned to every generated output segment using the function r(x, y, ȳ) = -1, where x is the input, y is the generated output, and ȳ is the average of all possible outputs. If the system generates an output segment, what is the value of the penalty it receives according to this function?
Evaluating a Reward Function for Story Generation

Learn Before

Related

Learn After