Concept
Off-Policy Value Target
Across all environments, the off-policy value target provides faster initial convergence. However, when comparing M0FFV runs with M0GB runs, the paper finds that this target alone is not enough: it needs to be combined with the on-policy value target.
Updated 2021-08-19
Tags
Data Science