Learn Before
A reinforcement learning agent controls a robot vacuum cleaner. The primary goal is for the robot to collect all pieces of trash in a room as quickly as possible. Which of the following reward function designs would be most effective at encouraging the desired behavior without leading to unintended negative consequences?
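To make the trade-offs in the question concrete, here is a minimal sketch of one candidate reward design, not a definitive answer. It assumes a hypothetical `StepOutcome` record and illustrative constants (`per_piece`, `step_cost`, `completion_bonus`), none of which come from the source. The idea is to reward net trash collected, charge a small cost for every time step so that speed matters, and make sure that dumping trash and picking it up again can never be profitable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepOutcome:
    """What happened during one time step (hypothetical fields for illustration)."""
    trash_collected: int  # pieces of trash newly picked up this step
    trash_dumped: int     # pieces the robot released back onto the floor
    room_is_clean: bool   # True once no trash remains in the room


def reward(outcome: StepOutcome,
           per_piece: float = 1.0,
           step_cost: float = 0.01,
           completion_bonus: float = 5.0) -> float:
    """Per-step reward; all constants are assumed values, not tuned ones."""
    # A small cost on every step pushes the agent to finish quickly.
    r = -step_cost
    # Reward net pickups, so a dump-and-recollect loop gains nothing.
    r += per_piece * (outcome.trash_collected - outcome.trash_dumped)
    # One-time bonus, assuming the episode ends when the room is clean.
    if outcome.room_is_clean:
        r += completion_bonus
    return r


# Honest pickup versus an attempted exploit (dump one piece, then re-collect it).
honest = reward(StepOutcome(1, 0, False))
exploit = reward(StepOutcome(0, 1, False)) + reward(StepOutcome(1, 0, False))
```

Under these assumed constants the honest pickup yields a positive reward, while the dump-and-recollect loop nets a small loss because the two step costs are all that remain. A design that rewarded only `trash_collected` would leave that loop open as a way to farm reward.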
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Diagnosing Flawed Agent Behavior
Designing a Grid World Reward Function