Concept

Off-Policy Value Target γ

Across all environments, the off-policy value target γ provides faster initial convergence. However, comparing the M0FFV runs with the M0GB runs, the paper finds that this target alone is not enough: it must be combined with the on-policy value target.

Updated 2021-08-19

Tags

Data Science