Learn Before
Reward Transformation Formula
This formula defines a transformed reward function, r_2, based on an original reward function, r_1. The new reward is calculated by adding an arbitrary function, f, to the original reward. All three functions depend on the current state (s_t), the action (a_t), and the subsequent state (s_{t+1}). The mathematical expression is:

r_2(s_t, a_t, s_{t+1}) = r_1(s_t, a_t, s_{t+1}) + f(s_t, a_t, s_{t+1})

This demonstrates how alternative reward functions can be generated, which is a core aspect of why reward models can be underdetermined.
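A minimal Python sketch of this transformation (the reward function r1, the choice of f as a constant shift, and the toy trajectory below are all hypothetical, for illustration only):

```python
# Sketch of the reward transformation r_2 = r_1 + f.
# r1, f, and the trajectory are hypothetical stand-ins.

def r1(s, a, s_next):
    """Original reward: pays 1 for arriving at the state named 'goal'."""
    return 1.0 if s_next == "goal" else 0.0

def f(s, a, s_next):
    """Arbitrary added function; here a constant shift, one of many choices."""
    return 0.5

def r2(s, a, s_next):
    """Transformed reward: r_2(s_t, a_t, s_{t+1}) = r_1(...) + f(...)."""
    return r1(s, a, s_next) + f(s, a, s_next)

# A constant shift changes every trajectory's return by the same amount
# (for trajectories of equal length), so the relative ordering of such
# trajectories is unchanged -- one concrete way that distinct reward
# functions can be equally consistent with the same preference data.
trajectory = [("s0", "right", "s1"), ("s1", "right", "goal")]
print(sum(r1(*step) for step in trajectory))  # 1.0
print(sum(r2(*step) for step in trajectory))  # 2.0
```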
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Role of Regularization in Mitigating Reward Model Underdetermination
Reward Transformation Formula
A research team is training a model to score the quality of text responses. The training data consists of pairs of responses, where for each pair one is labeled as 'better' than the other. The model's objective is to assign a higher score to the 'better' response in every pair. The team successfully trains two models, Model A and Model B, and discovers that their internal parameters are significantly different. However, both models achieve 100% accuracy on the training data, correctly assigning the higher score in every pair. What fundamental principle of model training does this outcome best demonstrate? (A minimal sketch of how this can happen appears after this list.)
Analyzing Reward Model Discrepancies
Explaining Score Discrepancies in Trained Models
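One way the Model A / Model B outcome above can arise: any strictly increasing transformation of a scorer's outputs preserves every pairwise comparison, so many different parameterizations fit the same preference data perfectly. A minimal sketch, assuming hypothetical toy scorers (the length-based feature and the response pairs below are invented for illustration):

```python
# Two scoring "models" with different parameters that agree on every
# pairwise preference -- the training objective cannot tell them apart.

pairs = [("a thorough, correct answer", "a curt answer"),
         ("a detailed reply", "a reply")]  # (better, worse), hypothetical data

def model_a(text):
    # Toy scorer: a single hypothetical feature (response length).
    return float(len(text))

def model_b(text):
    # Different "parameters": any strictly increasing transform of model_a
    # (here 2*score + 3) ranks every pair identically.
    return 2.0 * model_a(text) + 3.0

for better, worse in pairs:
    assert model_a(better) > model_a(worse)
    assert model_b(better) > model_b(worse)
print("Both models reach 100% pairwise accuracy with different parameters.")
```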
Learn After
A research team is training an agent and finds that two different reward functions, r_1 and r_2, lead to the agent learning the exact same optimal behavior. The relationship between the two functions is defined as r_2(s_t, a_t, s_{t+1}) = r_1(s_t, a_t, s_{t+1}) + f(s_t, a_t, s_{t+1}) for some non-zero function f. What is the most accurate explanation for this phenomenon? (A worked sketch of one such case appears after this list.)
Analyzing Reward Function Equivalence
Analyzing Reward Function Invariance
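A minimal sketch of one well-known case behind the question above: potential-based reward shaping, f(s, a, s') = gamma*phi(s') - phi(s) for any potential function phi (Ng, Harada & Russell, 1999), which provably leaves the optimal policy unchanged. The three-state chain MDP and the potential phi below are hypothetical choices for illustration:

```python
# Potential-based shaping: r2 = r1 + gamma*phi(s') - phi(s) induces the
# same optimal policy as r1. The chain MDP below is a toy example.

GAMMA = 0.9
STATES, ACTIONS = range(3), ("left", "right")

def step(s, a):
    """Deterministic transitions on a chain: 0 <-> 1 <-> 2."""
    return min(s + 1, 2) if a == "right" else max(s - 1, 0)

def r1(s, a, s_next):
    return 1.0 if s_next == 2 else 0.0   # original reward: reach the right end

def phi(s):
    return float(s)                       # arbitrary potential over states

def r2(s, a, s_next):
    return r1(s, a, s_next) + GAMMA * phi(s_next) - phi(s)

def greedy_policy(reward, iters=100):
    """Value iteration, then a greedy one-step lookahead per state."""
    v = [0.0] * len(STATES)
    for _ in range(iters):
        v = [max(reward(s, a, step(s, a)) + GAMMA * v[step(s, a)]
                 for a in ACTIONS) for s in STATES]
    return [max(ACTIONS, key=lambda a: reward(s, a, step(s, a))
                + GAMMA * v[step(s, a)]) for s in STATES]

# Both reward functions induce the same optimal behavior.
print(greedy_policy(r1))                  # ['right', 'right', 'right']
print(greedy_policy(r2))                  # identical policy
assert greedy_policy(r1) == greedy_policy(r2)
```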