Definition

Policy in Reinforcement Learning ($\pi$)

A policy, denoted by $\pi$, defines an agent's behavior by mapping states to actions. In a stochastic policy, $\pi(a \mid s)$ is the probability of taking action $a$ while in state $s$. Policies are central to reinforcement learning because the policy is the component that is optimized to maximize reward.
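
As a rough illustration of the definition above, a stochastic policy over a small discrete problem can be sketched as a lookup table from states to action probabilities. The state names, action names, probabilities, and helper functions below are all hypothetical and chosen only for this example; real policies are usually parameterized models rather than tables.

```python
import random

# Hypothetical tabular policy: for each state s, a distribution pi(. | s)
# over actions. States, actions, and numbers are illustrative assumptions.
POLICY = {
    "low_battery": {"recharge": 0.9, "explore": 0.1},
    "high_battery": {"recharge": 0.2, "explore": 0.8},
}


def action_probability(policy, state, action):
    """Return pi(a | s); actions not listed for the state get probability 0."""
    return policy[state].get(action, 0.0)


def sample_action(policy, state, rng=random):
    """Draw an action a ~ pi(. | s) for the given state."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(action_probability(POLICY, "low_battery", "recharge"))
    print(sample_action(POLICY, "high_battery"))
```

Here `action_probability` plays the role of $\pi(a \mid s)$, and `sample_action` is how an agent following the policy would pick its next action. Optimizing the policy would mean adjusting the probabilities in the table, which this sketch does not attempt.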

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
