Short Answer

Comparing Reward Structures in AI Training

Consider two scenarios for training an AI agent. In Scenario A, an agent learns to navigate a maze; it receives a small positive reward for every step that brings it closer to the exit and a small negative reward each time it hits a wall. In Scenario B, an agent learns to write a short story and receives a reward only after the entire story is written, based on its overall quality.

Compare the reward structures in these two scenarios. Identify which scenario uses a dense reward structure and which uses a sparse one, and analyze the primary training challenge associated with the sparse reward scenario.
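To make the two setups concrete, here is a minimal Python sketch of the reward signal each agent might see over one episode. Everything in it is an assumption for illustration and not part of the question: the grid coordinates, the Manhattan-distance test for "closer to the exit", the reward magnitudes of 0.1, and the names `scenario_a_rewards`, `scenario_b_rewards`, and `quality_fn` are all hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

Position = Tuple[int, int]


def manhattan(a: Position, b: Position) -> int:
    """Grid distance, used here as an assumed stand-in for 'closer to the exit'."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def scenario_a_rewards(
    start: Position,
    moves: Sequence[Tuple[int, int]],
    exit_pos: Position,
    walls: Set[Position],
    step_bonus: float = 0.1,     # assumed magnitude
    wall_penalty: float = -0.1,  # assumed magnitude
) -> List[float]:
    """Return one reward per move, as described for the maze agent."""
    rewards: List[float] = []
    pos = start
    for dx, dy in moves:
        nxt = (pos[0] + dx, pos[1] + dy)
        if nxt in walls:
            # Hitting a wall: small negative reward, agent stays in place.
            rewards.append(wall_penalty)
            continue
        closer = manhattan(nxt, exit_pos) < manhattan(pos, exit_pos)
        rewards.append(step_bonus if closer else 0.0)
        pos = nxt
    return rewards


def scenario_b_rewards(
    tokens: Sequence[str],
    quality_fn: Callable[[Sequence[str]], float],
) -> List[float]:
    """Return one reward per generated token, as described for the story agent.

    Only the final token carries a score; quality_fn is a hypothetical
    judge of the finished story.
    """
    if not tokens:
        return []
    return [0.0] * (len(tokens) - 1) + [quality_fn(tokens)]


maze_rewards = scenario_a_rewards(
    start=(0, 0),
    moves=[(1, 0), (0, 1), (1, 0)],
    exit_pos=(2, 1),
    walls={(1, 1)},
)
story_rewards = scenario_b_rewards(
    ["once", "upon", "a", "time"],
    quality_fn=lambda story: 0.8,  # placeholder score
)
print(maze_rewards)
print(story_rewards)
```

Comparing the two printed lists side by side, and asking which individual actions each nonzero entry can be attributed to, is one way to approach the analysis the question asks for.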

Updated 2025-10-10

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science