Off and On-Policy in MuZero
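MuZero trains on trajectories sampled from a replay buffer, and those trajectories were generated by earlier versions of the network, so one can consider it off-policy. However, the behavior policy (the one that generates the data) and the training policy (the one being improved) are the same policy at different points in training, differing only in how stale the behavior data is. Viewed from that perspective, it could also be considered on-policy.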
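The distinction can be made concrete with a toy simulation. The sketch below is not MuZero's implementation. It is a minimal illustration, assuming a FIFO buffer, one new trajectory per training step, and a hypothetical integer `policy_version` tag on each stored trajectory. All names here (`ReplayBuffer`, `staleness_per_update`) are invented for the example. It measures how many versions old the sampled data is at each update.

```python
import random
from collections import deque


class ReplayBuffer:
    """Toy FIFO buffer; each trajectory is tagged with the policy version that produced it."""

    def __init__(self, capacity):
        self.trajectories = deque(maxlen=capacity)

    def add(self, trajectory, policy_version):
        self.trajectories.append((trajectory, policy_version))

    def sample(self, rng):
        return rng.choice(list(self.trajectories))


def staleness_per_update(num_updates, capacity, seed=0):
    """Return, for each update, how many versions old the sampled trajectory was."""
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity)
    version = 0
    lags = []
    for _ in range(num_updates):
        # Behavior policy: the current version generates one trajectory.
        buffer.add(trajectory=f"episode-by-v{version}", policy_version=version)
        # Training: sample a stored trajectory, possibly from an older version.
        _, behavior_version = buffer.sample(rng)
        lags.append(version - behavior_version)
        # Assumption: every training step yields a new policy version.
        version += 1
    return lags


lags_large = staleness_per_update(num_updates=200, capacity=100)
lags_small = staleness_per_update(num_updates=200, capacity=1)
```

With `capacity=1` the buffer only ever holds the newest trajectory, so every lag is zero and training is strictly on-policy. With a larger buffer the lag is positive for many updates, which is the off-policy reading. How close to on-policy a real system is depends on the buffer size relative to the rate of network updates.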
Updated 2021-08-19
Tags: Data Science