Concept

On-policy vs Off-policy

On-policy methods evaluate or improve the same policy that is used to make decisions, that is, the policy that generates the data. In contrast, off-policy methods evaluate or improve a policy different from the one used to generate the data. Sarsa is a classic on-policy method, while Q-learning is a classic off-policy method.
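The difference shows up most clearly in the update targets of the two algorithms. The following is a minimal sketch, assuming a tabular setting where Q is a list of per-state lists of action values; the function names, the step size alpha, and the discount factor gamma are illustrative choices, not taken from any particular library.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action a_next that the behavior
    policy actually selected in s_next."""
    target = r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the greedy action in s_next,
    regardless of which action the behavior policy takes there."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


# Same transition, two different targets (hypothetical values).
Q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
Q_qlearn = [[0.0, 0.0], [1.0, 2.0]]

# The behavior policy explored and picked the non-greedy action 0 in state 1.
sarsa_update(Q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0)
q_learning_update(Q_qlearn, s=0, a=0, r=1.0, s_next=1)

print(round(Q_sarsa[0][0], 4), round(Q_qlearn[0][0], 4))
```

With these values the Sarsa estimate moves toward 1 + 0.9 * 1.0 = 1.9 and the Q-learning estimate toward 1 + 0.9 * 2.0 = 2.8, so after one step they are roughly 0.19 and 0.28. Sarsa's target depends on the exploratory action that was actually taken, whereas Q-learning's target reflects the greedy policy even though the data came from a different one.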


Updated 2021-08-18

Tags

Data Science

Learn After