Advantage Function Definition
The advantage function, A(s, a), quantifies the relative benefit of taking a specific action a compared to the expected value of following the policy from state s onward. It is formally defined as the difference between the action-value function, Q(s, a), and the state-value function, V(s):

A(s, a) = Q(s, a) − V(s)

A positive advantage suggests the action is better than the policy's expected value, while a negative advantage suggests it is worse. This measure is central to methods such as A2C (Advantage Actor-Critic), because it focuses policy updates on actions that are likely to improve performance.
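The definition above can be sketched in a few lines of Python. The Q-values and policy probabilities here are illustrative placeholders, not values from the card: V(s) is computed as the expectation of Q(s, a) under the policy, and each advantage is the gap between an action's Q-value and that expectation.

```python
# Minimal sketch of A(s, a) = Q(s, a) - V(s) for one state.
# The Q-value and policy numbers below are illustrative assumptions.

q_values = {"left": 65.0, "right": 40.0, "stay": 50.0}  # Q(s, a) estimates
policy = {"left": 0.4, "right": 0.3, "stay": 0.3}       # pi(a | s)

# V(s) is the expectation of Q(s, a) under the policy pi.
v = sum(policy[a] * q_values[a] for a in q_values)

# Advantage of each action relative to the state value.
advantages = {a: q - v for a, q in q_values.items()}
```

Actions with positive advantage ("left" here) would be reinforced by a policy-gradient update, while those with negative advantage would be discouraged.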

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Advantage Function Definition
In a reinforcement learning algorithm, a baseline is subtracted from the total reward to stabilize the learning process. Consider two different baseline strategies:
Strategy 1: Use a single, fixed value for the baseline, such as the average total reward calculated over many past episodes.
Strategy 2: Use a dynamic value for the baseline equal to the expected future reward from the agent's current state.
Why is Strategy 2 generally more effective at reducing the variance of the policy updates compared to Strategy 1?
Evaluating Actions with a State-Value Baseline
Analyzing the Impact of a State-Value Baseline
Advantage Function Definition
An agent is being trained in an environment where it must choose between two initial actions from the same starting position. Action A leads to a short sequence of steps resulting in a small, immediate reward. Action B leads to a much longer sequence of steps resulting in a large, delayed reward. According to the action-value function formula, which calculates the expected total discounted reward for taking an action in a state, how would decreasing the discount factor (γ) from a high value (e.g., 0.99) to a very low value (e.g., 0.1) most likely influence the agent's learned behavior?
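The effect described in this question can be checked numerically. The reward magnitudes and step counts below are illustrative assumptions, not from the card: a small reward after 1 step versus a large reward after 20 steps, compared under γ = 0.99 and γ = 0.1.

```python
# Effect of the discount factor gamma on immediate vs. delayed rewards.
# Reward sizes and delays are illustrative assumptions.

def discounted_return(reward, delay, gamma):
    """Present value of a single reward received after `delay` steps."""
    return (gamma ** delay) * reward

for gamma in (0.99, 0.1):
    action_a = discounted_return(10, 1, gamma)    # small, immediate reward
    action_b = discounted_return(100, 20, gamma)  # large, delayed reward
    print(f"gamma={gamma}: A={action_a:.4f}, B={action_b:.4g}")
```

With γ = 0.99 the delayed reward still dominates, but with γ = 0.1 the factor 0.1^20 shrinks it to essentially zero, so the agent would prefer the small immediate reward.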
Calculating Action-Values in a Simple Environment
Match each component of the action-value function formula, Q(s, a) = E[ Σ_{t≥0} γ^t r_t | s_0 = s, a_0 = a ], with its correct description.
Learn After
Policy Gradient Reformulation using Advantage Function
Advantage Function Estimation using Reward-to-Go
An autonomous agent in a reinforcement learning environment is in a particular state. From this state, the expected cumulative future reward, when averaged across all possible actions, is calculated to be 50 points. The agent is evaluating three specific actions:
- Action X: The expected cumulative reward for taking this action is 65 points.
- Action Y: The expected cumulative reward for taking this action is 40 points.
- Action Z: The expected cumulative reward for taking this action is 50 points.
Based on this information, which statement provides the most accurate analysis for guiding the agent's next policy update?
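The arithmetic in this scenario can be verified directly: with V(s) = 50, each action's advantage is its expected cumulative reward minus the state value.

```python
# Advantages for the three actions in the scenario, with V(s) = 50 points.
v_s = 50.0
q = {"X": 65.0, "Y": 40.0, "Z": 50.0}  # expected cumulative rewards from the card

adv = {action: q_a - v_s for action, q_a in q.items()}
# Action X has positive advantage, Y negative, and Z exactly zero.
```

So a policy update guided by the advantage would increase the probability of X, decrease that of Y, and leave Z's probability effectively unchanged.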
In a reinforcement learning scenario, an agent in a specific state calculates that the 'advantage' of performing a particular action is exactly zero. What is the most accurate interpretation of this finding?
Temporal Difference (TD) Error as an Advantage Function Estimator
Analysis of an Agent's Suboptimal Policy