1Cademy - The Agent-Environment Interaction Loop in Reinforcement Learning

Learn Before

Bridging Language Modeling and Reinforcement Learning Notations in RLHF

Activity (Process)

The Agent-Environment Interaction Loop in Reinforcement Learning

The general framework of reinforcement learning centers on an agent interacting with a dynamic environment. The interaction unfolds as a repeating cycle: at each step, the agent observes the environment's current state, selects an action according to its policy, and executes that action; the environment then returns a reward and a new state as feedback. The agent uses this feedback to improve its policy over time, so this loop of observing, acting, and receiving feedback forms the basis of learning.
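
The cycle described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the `CoinFlipEnv` class, its `reset`/`step` methods, the `policy` function, and `run_episode` are all hypothetical names introduced here, and the environment is a toy in which the state is a random bit and the reward is 1 when the action matches it. The sketch shows the interaction loop only; it does not update the policy.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the state is a bit, and the agent
    is rewarded for choosing the action equal to the current state."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon          # number of steps per episode
        self.rng = random.Random(seed)  # private RNG for reproducibility

    def reset(self):
        """Start a new episode and return the initial state."""
        self.t = 0
        self.state = self.rng.randint(0, 1)
        return self.state

    def step(self, action):
        """Apply the action; return (next_state, reward, done)."""
        reward = 1.0 if action == self.state else 0.0
        self.t += 1
        self.state = self.rng.randint(0, 1)  # environment moves to a new state
        done = self.t >= self.horizon
        return self.state, reward, done


def policy(state):
    """Hypothetical fixed policy: choose the action equal to the observed state."""
    return state


def run_episode(env, policy):
    """Run one episode of the observe -> act -> feedback loop."""
    state = env.reset()                  # observe the initial state
    total_reward = 0.0
    trajectory = []
    done = False
    while not done:
        action = policy(state)           # select an action from the policy
        next_state, reward, done = env.step(action)  # execute, get feedback
        trajectory.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state               # the new state is observed next step
    return total_reward, trajectory


total, trajectory = run_episode(CoinFlipEnv(horizon=5), policy)
print(total, len(trajectory))
```

A learning algorithm would use the recorded `(state, action, reward, next_state)` tuples to adjust the policy between or during episodes; that update step is deliberately left out here.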

Updated 2025-10-10
