1Cademy - Potential-Based Shaping Function Formula

Learn Before

Condition for Policy Invariance in Reward Shaping

Potential-Based Shaping Function Formula

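To ensure that reward shaping does not alter the optimal policy, the shaping reward function $f$ is defined as the discounted potential of the next state minus the potential of the current state. This is known as a potential-based shaping function:

$f(s_t, a_t, s_{t+1}) = \gamma\Phi(s_{t+1}) - \Phi(s_t)$

Here, $\Phi$ is a real-valued potential function defined over the state space, and $\gamma$ is the discount factor. Although $f$ takes the action $a_t$ as an argument, its value depends only on the two states. Along any trajectory the discounted shaping terms telescope, so the shaped action values differ from the original ones only by $\Phi(s_t)$, an offset that does not depend on the action. The ranking of actions in every state, and therefore the optimal policy, is unchanged.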
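
The invariance claim can be checked numerically on a toy problem. The following is a minimal sketch, not a reference implementation: the five-state chain, its reward, the names `step`, `shaping`, `optimal_q`, and `greedy_policy`, and the arbitrary potential values in `PHI` are all assumptions made for illustration. It runs value iteration with and without the shaping term $\gamma\Phi(s_{t+1}) - \Phi(s_t)$ and compares the resulting greedy policies.

```python
GAMMA = 0.9                        # discount factor (gamma in the formula)
N_STATES = 5                       # hypothetical chain of states 0..4
ACTIONS = (-1, +1)                 # move left or move right
PHI = [3.0, -1.0, 4.0, 0.5, 2.0]   # arbitrary potential, one value per state


def step(s, a):
    """Deterministic continuing chain: reward 1 for landing on the last state."""
    s_next = min(max(s + a, 0), N_STATES - 1)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward


def shaping(s, s_next):
    """Potential-based shaping term f = gamma * Phi(s') - Phi(s)."""
    return GAMMA * PHI[s_next] - PHI[s]


def optimal_q(use_shaping, sweeps=300):
    """Approximate the optimal action values by repeated Bellman backups."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s in range(N_STATES):
            for a in ACTIONS:
                s_next, r = step(s, a)
                if use_shaping:
                    r += shaping(s, s_next)
                q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
    return q


def greedy_policy(q):
    """Best action in each state according to q."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)]


q_plain = optimal_q(use_shaping=False)
q_shaped = optimal_q(use_shaping=True)

print("policy without shaping:", greedy_policy(q_plain))
print("policy with shaping:   ", greedy_policy(q_shaped))
```

In this sketch the two policies coincide even though `PHI` was chosen arbitrarily, and each shaped action value equals the unshaped one minus `PHI[s]`. The chain is deliberately continuing (no terminal state); in episodic tasks the potential of terminal states is usually set to zero so that the same argument applies.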