Critique of an Arbitrary Shaping Function
A reinforcement learning engineer proposes adding a shaping function to an agent's reward to encourage faster learning. Their argument is: 'As long as my shaping function, f(s, a, s'), provides some extra positive reward for actions that seem directionally correct, it will only help the agent and won't change the ultimate goal.' Explain the fundamental flaw in this reasoning. Describe a simple, hypothetical scenario where a seemingly helpful shaping function could cause the agent to learn a final policy that is different from the one that would be optimal with the original rewards alone.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Potential-Based Shaping Function Formula
Analysis of a Flawed Reward Shaping Implementation
A reinforcement learning agent is being trained to navigate a maze. The original reward function provides a large positive reward only upon reaching the exit. To speed up learning, a developer adds a shaping reward function that gives a small, constant positive reward for every single action the agent takes, regardless of the state. After this change, the agent learns a new policy of moving in a perpetual loop instead of solving the maze. Why did adding this specific shaping reward alter the optimal policy?
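In contrast, a potential-based shaping function of the form F(s, a, s') = γΦ(s') − Φ(s) (the formula the related card refers to) cannot create such loops, because its contributions telescope along any trajectory. The sketch below reuses the same hypothetical one-state exit-or-loop MDP with assumed numbers (γ = 0.99, goal reward 10); the potential values Φ are likewise arbitrary assumptions:

```python
# Hedged sketch: same hypothetical exit-or-loop MDP, but with a
# potential-based shaping term F(s, a, s') = gamma * phi(s') - phi(s).
GAMMA = 0.99
GOAL_REWARD = 10.0

def shaped_q(phi_s, phi_goal=0.0):
    """Q-values under potential-based shaping; phi values are assumptions."""
    f_exit = GAMMA * phi_goal - phi_s       # shaping term for exiting
    f_loop = GAMMA * phi_s - phi_s          # = (gamma - 1) * phi_s, <= 0
    q_exit = GOAL_REWARD + f_exit           # terminal: no future value
    q_loop = f_loop / (1.0 - GAMMA)         # geometric series of loop terms
    return {"exit": q_exit, "loop": q_loop}

# For any choice of potential, 'exit' stays optimal: the loop's shaping
# terms sum to -phi_s, never enough to overturn the goal reward.
for phi_s in (0.0, 5.0, 50.0):
    q = shaped_q(phi_s)
    assert q["exit"] > q["loop"]
```

A constant positive per-action bonus is not expressible as γΦ(s') − Φ(s) for any Φ, which is exactly why it can alter the optimal policy while a potential-based term cannot.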