1Cademy - Analyzing Reward Function Invariance

Learn Before

Reward Transformation Formula

Short Answer

Analyzing Reward Function Invariance

Consider two reward functions, r and r', related by the equation r'(s_t, a_t, s_{t+1}) = r(s_t, a_t, s_{t+1}) + f(s_t, a_t, s_{t+1}). Explain why it is possible for an agent to learn the exact same optimal behavior under both r and r', even when the function f is not always zero. What does this phenomenon reveal about the challenge of inferring a single, true reward function from observing an agent's actions?
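One concrete case worth checking numerically is potential-based shaping, where f(s_t, a_t, s_{t+1}) = gamma * Phi(s_{t+1}) - Phi(s_t) for some function Phi of the state alone. The question does not fix a form for f, so this choice is an assumption, as are the toy 3-state chain, the discount gamma = 0.9, the values of Phi, and every name in the sketch below. It runs plain value iteration under r and under r' and compares the greedy policies.

```python
# Minimal sketch (hypothetical toy MDP, not taken from the source page).
GAMMA = 0.9
STATES = [0, 1, 2]
ACTIONS = [0, 1]  # 0 = move left, 1 = move right

def step(s, a):
    """Deterministic chain: right moves toward state 2, left toward state 0."""
    return min(s + 1, 2) if a == 1 else max(s - 1, 0)

def r(s, a, s_next):
    """Original reward: 1 whenever the agent lands in state 2."""
    return 1.0 if s_next == 2 else 0.0

# Assumed potential function over states; any fixed values would do.
PHI = [5.0, -3.0, 2.0]

def f(s, a, s_next):
    """Potential-based shaping term; nonzero for most transitions."""
    return GAMMA * PHI[s_next] - PHI[s]

def r_prime(s, a, s_next):
    """Transformed reward r' = r + f."""
    return r(s, a, s_next) + f(s, a, s_next)

def q_values(reward, sweeps=1000):
    """Value iteration on Q(s, a) for the deterministic chain."""
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    for _ in range(sweeps):
        q = [[reward(s, a, step(s, a)) + GAMMA * max(q[step(s, a)])
              for a in ACTIONS] for s in STATES]
    return q

def greedy(q):
    """Greedy action in each state."""
    return [max(ACTIONS, key=lambda a: row[a]) for row in q]

q_r = q_values(r)
q_rp = q_values(r_prime)
policy_r = greedy(q_r)
policy_rp = greedy(q_rp)

print("greedy policy under r :", policy_r)
print("greedy policy under r':", policy_rp)
```

In this sketch the optimal Q-values under r' equal those under r minus Phi(s). That offset depends only on the state, so it shifts every action in a state by the same amount and leaves the argmax over actions unchanged. Many different reward functions can therefore explain the same observed behavior.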

Updated 2025-10-08
