Agent-Environment Interaction Loop in Reinforcement Learning
The core of reinforcement learning is the interaction between an agent and a dynamic environment, modeled as a sequential decision process. At every time step, the agent observes the environment's current state and uses its policy to select an action. After the action is executed, the environment provides feedback consisting of a reward and a new state. This cycle of observing, acting, and receiving feedback repeats until the episode terminates, for example when the agent reaches its objective.
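The loop described above can be sketched in a few lines of code. This is a minimal illustrative sketch, not a real library: the `LeverEnv` class, its `reset`/`step` interface (loosely following the common Gym-style convention), and the toy reward scheme are all assumptions made for this example.

```python
import random


class LeverEnv:
    """Hypothetical toy environment: pressing the lever (action 1)
    dispenses food (+1 reward); any other action (0) yields no reward."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.t = 0
        return "hungry"

    def step(self, action):
        """Execute an action; return (next_state, reward, done)."""
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        next_state = "fed" if action == 1 else "hungry"
        done = self.t >= self.max_steps  # episode ends after max_steps
        return next_state, reward, done


def random_policy(state):
    """A policy maps a state to an action; here, chosen at random."""
    return random.choice([0, 1])


def run_episode(env, policy):
    """One complete agent-environment interaction loop."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                    # agent selects an action
        state, reward, done = env.step(action)    # environment gives feedback
        total_reward += reward                    # agent accumulates reward
    return total_reward
```

A policy that always presses the lever collects the maximum reward, while one that never presses it collects nothing; learning algorithms adjust the policy toward the former based on the accumulated feedback.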
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Useful Website for Reinforcement Learning
Environment in Reinforcement Learning
State in Reinforcement Learning
Agent in Reinforcement Learning
Action in Reinforcement Learning
Reward in Reinforcement Learning
Useful Book for Reinforcement Learning
Useful Tutorials about Math behind Reinforcement Learning
Math Behind Reinforcement Learning
Exploration/Exploitation trade-off
Classification of Reinforcement Learning Methods
On-policy vs Off-policy
Actor-Critic Methods
Deep Reinforcement Learning with Double Q-learning
Q-learning
Combining Off and On-Policy Training in Model-Based Reinforcement Learning
MuZero
Reinforcement Learning Process for LLMs
Analyzing a Learning System
A robot is being trained to navigate a maze to find a piece of cheese. Analyze this scenario by matching each element of the training process to its corresponding fundamental concept.
Agent-Environment Interaction Loop in Reinforcement Learning
A cat is learning to use a new automated feeder that dispenses food when a lever is pressed. Initially, the cat paws at the lever randomly. After several attempts, it presses the lever and food is dispensed. The cat begins to press the lever more frequently. Which of the following statements best analyzes the relationship between the core components in this learning scenario?
A text-generation model is being optimized to produce high-quality responses. The process starts with an input prompt. The model then generates a sequence of text. This generated text is passed to a separate automated scoring system, which outputs a single numerical value representing the response's quality. The model's internal configuration is then updated based on this score to improve its future outputs. Match each abstract component of a learning system (left column) to its concrete implementation in this text-generation scenario (right column).
LLM as the Agent in RLHF
A team is improving a text-generation model. The process involves providing the model with an input prompt, to which the model generates a textual response. A human evaluator then assigns a numerical score to this response based on its quality. This score is used to adjust the model's behavior for future responses. If this entire process is described using the framework of a system learning from sequential decisions, what component of the text-generation process corresponds to the 'policy'?
The Agent-Environment Interaction Loop in Reinforcement Learning
Deconstructing a Model Training Interaction
Learn After
Analyzing an AI Game Player
A learning agent interacts with its surroundings in a cyclical process to achieve a goal. Arrange the following four events to represent the correct order of one complete cycle of this interaction.
An autonomous robot vacuum is programmed to maximize the amount of floor space it cleans. When its optical sensor identifies a dirty area on the floor, the robot's internal software chooses to activate the suction and brush mechanism. Upon successfully cleaning the area, a specific numerical value is added to an internal 'score' that tracks its performance. In this interaction, what does the addition of the numerical value to the 'score' represent?