Learn Before
Impact of Reward Model Flaws on Value Function Estimation
An agent is being trained to navigate a maze. Its reward model is designed to give a small positive signal for each step that does not hit a wall and a large positive signal for reaching the exit. However, due to a flaw, the model also gives a moderately high positive signal for moving into a specific dead-end corridor. Analyze the likely effect of this flaw on the agent's computed long-term value for states within and near this corridor. How might this flawed value estimation, in turn, influence the agent's final learned path through the maze?
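One way to see the effect concretely is to run tabular value iteration on a toy version of the problem. The following is a minimal sketch, assuming a hypothetical 4x4 gridworld, deterministic moves, a discount factor of 0.95, and illustrative reward magnitudes (+0.1 per non-wall step, +100 at the exit, a spurious +10 for entering the dead end); the layout and all constants are assumptions for illustration, not taken from the question.

```python
# A minimal value-iteration sketch of the flawed maze.
# Assumptions (illustrative, not from the question): 4x4 grid, deterministic
# moves, gamma = 0.95, +0.1 per non-wall step, +100 at the exit, and a
# spurious +10 every time the agent steps INTO the dead-end cell.
import numpy as np

ROWS, COLS = 4, 4
EXIT = (0, 3)                    # terminal goal state
DEAD_END = (3, 3)                # reachable only from (3, 2)
WALLS = {(1, 1), (2, 1), (1, 3), (2, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
GAMMA = 0.95

def step(s, a):
    """Deterministic move; bumping a wall or the border leaves s unchanged."""
    r, c = s[0] + a[0], s[1] + a[1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or (r, c) in WALLS:
        return s
    return (r, c)

def reward(s, s2):
    """Flawed reward model from the question, with made-up magnitudes."""
    if s2 == EXIT:
        return 100.0
    if s2 == DEAD_END:
        return 10.0              # the flaw: a repeatable bonus
    return 0.1 if s2 != s else 0.0

V = np.zeros((ROWS, COLS))
for _ in range(300):             # enough sweeps to converge at gamma = 0.95
    V_new = V.copy()
    for r in range(ROWS):
        for c in range(COLS):
            if (r, c) in WALLS or (r, c) == EXIT:
                continue         # exit is terminal; walls are unreachable
            V_new[r, c] = max(
                reward((r, c), step((r, c), a)) + GAMMA * V[step((r, c), a)]
                for a in ACTIONS
            )
    V = V_new

print(np.round(V, 1))            # note the inflated values in/near (3,2)-(3,3)
```

With these numbers the corridor forms a repeatable reward loop: the spurious +10 can be collected every time the agent re-enters the dead end, so the discounted return of cycling at the corridor entrance (roughly 104) exceeds that of any path to the exit from there (below 95). Value iteration accordingly inflates the values of states in and around the corridor, and the greedy policy from the lower half of the maze detours into the dead end and stays there instead of exiting.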
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Impact of Reward Model Flaws on Value Function Estimation
A reinforcement learning agent is trained to find the exit in a maze. Two reward models are proposed. Model A gives a reward of +100 for reaching the exit and 0 for every other step. Model B gives +100 for reaching the exit but also a -1 penalty for each step taken. How will the value function derived from Model B most likely differ from the one derived from Model A for states that are not the exit?
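The contrast can be computed in a few lines. Below is a minimal sketch, assuming an undiscounted episodic setting, a hypothetical 1D corridor of six cells with the exit at the right end, and an optimal policy that walks straight to the exit; the corridor length and helper names are illustrative, not part of the original question.

```python
# A minimal sketch contrasting the two reward models on a toy corridor.
# Assumptions (illustrative): 6 cells, exit at the right end, undiscounted
# episodic returns, optimal policy walks directly right.
N = 6                    # cells 0..5; cell 5 is the terminal exit
EXIT_REWARD = 100.0

def optimal_value(cell, step_reward):
    """Return of the shortest path from `cell`: the exit reward plus the
    per-step reward collected on each of the remaining moves."""
    dist = (N - 1) - cell        # moves needed to reach the exit
    if dist == 0:
        return 0.0               # terminal state: no further return
    return EXIT_REWARD + step_reward * dist

for cell in range(N - 1):
    v_a = optimal_value(cell, step_reward=0.0)   # Model A: 0 per step
    v_b = optimal_value(cell, step_reward=-1.0)  # Model B: -1 per step
    print(f"cell {cell}: V_A = {v_a:5.1f}   V_B = {v_b:5.1f}")
```

Under Model A every non-exit cell has the same value (100), so the value function carries no information about path length; under Model B the values are uniformly lower and increase strictly as the agent nears the exit, giving the derived policy a gradient toward shorter paths.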
Diagnosing Undesirable Agent Behavior
In a reinforcement learning framework, it is possible to compute a meaningful long-term value function for a policy even if the reward model consistently provides random, uninformative feedback for every action.
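The key distinction here is between a value function that is computable and one that is meaningful. With rewards that are random and independent of state and action, the Bellman computation still converges, but it converges to approximately the same constant, E[r] / (1 - gamma), for every state. Below is a minimal TD(0) sketch, assuming a hypothetical 5-state random-walk chain and i.i.d. zero-mean rewards; all constants are illustrative.

```python
# A minimal TD(0) sketch with a purely random, uninformative reward signal.
# Assumptions (illustrative): 5-state random-walk chain, i.i.d. rewards
# uniform on [-1, 1] regardless of state or action, fixed step size.
import random

N_STATES, GAMMA, ALPHA = 5, 0.9, 0.05
rng = random.Random(0)
V = [0.0] * N_STATES

s = 0
for _ in range(200_000):
    s2 = max(0, min(N_STATES - 1, s + rng.choice([-1, 1])))  # random walk
    r = rng.uniform(-1.0, 1.0)              # reward ignores state and action
    V[s] += ALPHA * (r + GAMMA * V[s2] - V[s])  # TD(0) update
    s = s2

# Every estimate hovers near E[r] / (1 - GAMMA) = 0, up to step-size noise.
print([round(v, 2) for v in V])
```

All five estimates settle near the same value, up to stochastic-approximation noise from the fixed step size: the computation is perfectly well defined, but the resulting value function is flat across states and therefore cannot distinguish good states from bad ones.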