1Cademy - Diagnosing Flawed Agent Behavior

Learn Before

Reward Function in Reinforcement Learning

Case Study

Diagnosing Flawed Agent Behavior

An agent is being trained to navigate a maze and reach a specific goal location. After extensive training, the agent takes an unnecessarily long, winding path, often revisiting the same locations several times before finally reaching the goal. Analyze the agent's reward structure provided below and explain the logical flaw that is most likely causing this inefficient behavior.
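The reward structure itself is not reproduced in this copy of the case study, so the sketch below does not restate it. It only shows one way to check a candidate reward scheme: compute the total return of a short path and a long path and see which one the agent is paid more for. The function `path_return`, its parameters, and both example schemes are hypothetical and chosen purely for illustration.

```python
def path_return(num_steps, step_reward, goal_reward, gamma=1.0):
    """Discounted return of an episode that reaches the goal on its final step.

    Hypothetical helper: every step earns `step_reward`, and the final
    step additionally earns `goal_reward`.
    """
    total = 0.0
    for t in range(num_steps):
        reward = step_reward
        if t == num_steps - 1:
            reward += goal_reward  # goal bonus arrives on the last step
        total += (gamma ** t) * reward
    return total


# Hypothetical scheme A: a positive reward for every move, plus a goal bonus.
short_a = path_return(num_steps=10, step_reward=1.0, goal_reward=10.0)
long_a = path_return(num_steps=30, step_reward=1.0, goal_reward=10.0)

# Hypothetical scheme B: a penalty for every move, plus the same goal bonus.
short_b = path_return(num_steps=10, step_reward=-1.0, goal_reward=10.0)
long_b = path_return(num_steps=30, step_reward=-1.0, goal_reward=10.0)

print("scheme A:", short_a, long_a)
print("scheme B:", short_b, long_b)
```

Under these assumed numbers, scheme A pays more for the longer path and scheme B pays more for the shorter one. Applying the same comparison to the reward structure under analysis is one way to test whether it makes wandering more profitable than heading straight for the goal.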

Updated 2025-10-02
