1Cademy - MuZero

Learn Before

Fundamental Concepts for Reinforcement Learning

Concept

MuZero

MuZero is a model-based reinforcement learning method which contains a representation function, a dynamics function, and a prediction function.The representation function encodes an observation sequence into a hidden state. The dynamics function evaluates the current hidden state and a potential action to determine the potential next hidden state and a reward value. Lastly, the prediction function is the policy generation function. They use similar parameters and are optimized according to a combined loss function.

A Monte Carlo Tree Search (MCTS) is often used when training to evaluate action trees from the current/ root state.

Updated 2021-08-19

Contributors are:

Justin McIntosh

🏆 5