Learn Before
Policy in Reinforcement Learning (π)
A policy, denoted by π, defines an agent's behavior by mapping states to actions. In a stochastic policy, π(a | s) gives the probability of taking action a while in state s. Policies are central to reinforcement learning: the policy is the component that is optimized to maximize cumulative reward.
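The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the states "s0"/"s1" and actions "left"/"right" are hypothetical, and the policy is stored as a plain dictionary of per-state action distributions.

```python
import random

# Hypothetical stochastic policy: each state maps to a probability
# distribution over actions (probabilities in each state sum to 1).
policy = {
    "s0": {"left": 0.7, "right": 0.3},
    "s1": {"left": 0.2, "right": 0.8},
}

def action_probability(state, action):
    """pi(a | s): probability of taking `action` while in `state`."""
    return policy[state].get(action, 0.0)

def sample_action(state, rng=random):
    """Act according to the policy: draw an action from pi(. | state)."""
    actions, probs = zip(*policy[state].items())
    return rng.choices(actions, weights=probs, k=1)[0]
```

A deterministic policy is the special case where one action per state has probability 1, so `sample_action` always returns the same action for that state.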

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Theory
Concept
Misinformation
Information Overload
Prototypes
General Knowledge References
Information References
Literacy
The Three Forms of Information
Information Disciplines
Information Dissemination
Distributed Summation Implementation
Vector Transformation Formula
Matrix Bracket Notation
Query, Key, and Value in Attention Mechanisms
Cumulative Future Reward (Return)
Causality in Reinforcement Learning
Less Than Inequality
Average Value Notation ()
Function of a Predicted Future Value Notation ()
Draft Model Probability Distribution ()
Weight Matrix Definition ()
Index Calculation for Sequence Start Position
Sequence of Cyclic Subgroups Notation
Greater Than Inequality
Sequence of Predicted Future Values Notation
Conditional Probability of the Next Element in a Sequence
Weighted Softmax Function Notation
Parameterized Prediction Function Notation ()
Data vs. Information in Model Training
Row Vector Notation ()
A climate scientist reads ten peer-reviewed articles, synthesizes the data and arguments presented, and develops a new, deeper understanding of the acceleration of glacial melt. This new understanding within the scientist's mind best exemplifies which of the following?
Start Index Calculation for a Context Window
Vector Prefix Notation
Sequence of Elements in Angle Brackets Notation
A user asks a large language model to explain a scientific concept. The model retrieves relevant data, synthesizes it, and generates a paragraph as a response. The user reads this paragraph and gains a new understanding. Which part of this scenario best exemplifies 'information-as-process'?
Policy in Reinforcement Learning (π)
Probability of a Predicted Future Value Notation ()
Predicted Future Value Notation ()
Uncluttered Notation for Encoder-Classifier Models
Data (Information)
Learn After
Reference Policy ()
Policy Probability Ratio (Ratio Function)
An autonomous agent is being trained to navigate a maze. The agent's decision-making process at any given intersection (a 'state') is determined by a specific component of its programming. Which of the following scenarios best exemplifies this decision-making component?
An autonomous agent is programmed to navigate a grid. When it reaches a specific grid cell (state 'S'), it must choose an action. Consider two different versions of the agent's programming:
- Agent 1: When in state 'S', it is programmed to always choose the action 'move North'.
- Agent 2: When in state 'S', it is programmed to choose 'move North' with 70% probability and 'move East' with 30% probability.
Which statement best analyzes the difference in how these two agents map states to actions?
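The two agents in the scenario above can be sketched directly as policy functions. This is an illustrative sketch only (function names are assumptions); it encodes the scenario without giving away which analysis statement is correct.

```python
import random

def agent1_policy(state):
    # Agent 1: deterministic mapping — in state 'S', always the same action.
    if state == "S":
        return "move North"

def agent2_policy(state, rng=random):
    # Agent 2: stochastic mapping — in state 'S', a distribution over actions.
    if state == "S":
        return rng.choices(["move North", "move East"],
                           weights=[0.7, 0.3], k=1)[0]
```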
An agent's goal is to navigate a simple environment and maximize its total reward. The agent is currently in a state 'S'. From this state, it can take one of two actions: 'Action 1' which consistently leads to a reward of +10, or 'Action 2' which consistently leads to a reward of -5. Consider two possible behavior patterns for the agent when it is in state 'S':
- Behavior A: The agent chooses 'Action 1' with a 100% probability.
- Behavior B: The agent chooses 'Action 1' with a 50% probability and 'Action 2' with a 50% probability.
Which behavior pattern is superior for achieving the agent's goal, and why?
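The comparison in the question above reduces to an expected-reward calculation. A minimal sketch (variable names are assumptions) that computes the expected one-step reward of each behavior pattern in state 'S':

```python
# Rewards for each action from state 'S', as given in the scenario.
rewards = {"Action 1": 10, "Action 2": -5}

# Each behavior pattern is a probability distribution over actions.
behavior_a = {"Action 1": 1.0}
behavior_b = {"Action 1": 0.5, "Action 2": 0.5}

def expected_reward(behavior):
    """Expected reward: sum over actions of P(action) * reward(action)."""
    return sum(p * rewards[a] for a, p in behavior.items())

# expected_reward(behavior_a) -> 10.0
# expected_reward(behavior_b) -> 2.5
```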