Activity (Process)

Agent-Environment Interaction Loop in Reinforcement Learning

The core of reinforcement learning is the interaction between an agent and a dynamic environment, modeled as a sequential decision process. At every time step, the agent observes the environment's current state and uses its policy to select an action. Once the agent executes the action, the environment responds with a reward and a new state. This cycle of observing, acting, and receiving feedback repeats until the episode ends, for example when the agent reaches its objective.
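The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `CorridorEnv` toy environment, the `random_policy` function, the `run_episode` helper, and the `max_steps` cutoff are all hypothetical names introduced here for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class CorridorEnv:
    """Hypothetical toy environment: a corridor of cells 0..length-1.

    The agent starts in cell 0 and the episode ends at the last cell.
    """
    length: int = 5
    position: int = 0

    def reset(self) -> int:
        # Put the agent back at the start and return the initial state.
        self.position = 0
        return self.position

    def step(self, action: int):
        # action is -1 (move left) or +1 (move right); walls clamp the move.
        self.position = min(max(self.position + action, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # reward only on reaching the goal
        return self.position, reward, done


def random_policy(state: int, rng: random.Random) -> int:
    # Assumed placeholder policy: ignores the state and picks a direction.
    return rng.choice([-1, 1])


def run_episode(env, policy, max_steps: int = 100, seed: int = 0):
    """Run one episode of the observe -> act -> feedback cycle."""
    rng = random.Random(seed)
    state = env.reset()                              # observe initial state
    total_reward = 0.0
    trajectory = []
    for _ in range(max_steps):
        action = policy(state, rng)                  # policy selects an action
        next_state, reward, done = env.step(action)  # environment responds
        trajectory.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state                           # new state becomes current
        if done:                                     # objective reached
            break
    return total_reward, trajectory


if __name__ == "__main__":
    total, steps = run_episode(CorridorEnv(), random_policy)
    print(f"return={total}, steps taken={len(steps)}")
```

Swapping in a policy that always moves right, such as `lambda state, rng: 1`, reaches the goal in four steps in the default five-cell corridor. The `max_steps` cutoff is a design choice in this sketch so that a poor policy cannot loop forever.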


Updated 2026-05-02


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
