Concept

Learn the Model in Model-Based Methods for Deep RL

In model-based reinforcement learning, the model may be known or learned. In the latter case, we run a base policy, such as a random or other exploratory policy, and observe the resulting trajectories.

  1. run a base policy $\pi_0(s_t, a_t)$ to collect $D = \{ (s, a, s')_i \}$
  2. learn a dynamics model $f(s, a)$ to minimize $\sum_i \| f(s_i, a_i) - s_i' \|^2$
  3. backpropagate through $f(s, a)$ into the policy to optimize $\pi_\theta(s_t, a_t)$
  4. run $\pi_\theta(s_t, a_t)$ and add the resulting data $\{ (s, a, s')_j \}$ to $D$
  5. repeat from step 2
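The loop above can be sketched on a toy one-dimensional linear system. Everything here is an illustrative assumption (the true dynamics, the linear model class, the linear policy $a = k s$, the quadratic cost, and all step sizes), not part of the original algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    # Unknown environment dynamics (assumed for this toy example)
    return 0.8 * s + 0.5 * a + 0.01 * rng.normal()

def collect(policy, n=200):
    # Roll out a policy and record (s, a, s') transitions
    data, s = [], rng.normal()
    for _ in range(n):
        a = policy(s)
        s_next = true_step(s, a)
        data.append((s, a, s_next))
        s = s_next
    return data

def fit_model(data):
    # Step 2: least-squares fit of a linear model f(s, a) = w_s*s + w_a*a
    X = np.array([[s, a] for s, a, _ in data])
    y = np.array([sp for _, _, sp in data])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w  # [w_s, w_a]

k = 0.0                                  # linear policy a = k * s
D = collect(lambda s: rng.normal())      # step 1: random base policy

for _ in range(5):                       # step 5: outer loop
    w_s, w_a = fit_model(D)              # step 2
    # Step 3: gradient descent on the predicted cost (s')^2,
    # differentiating (w_s + k*w_a)^2 through the learned model
    for _ in range(100):
        grad = 2 * (w_s + k * w_a) * w_a
        k -= 0.1 * grad
    D += collect(lambda s: k * s)        # step 4

# k approaches -w_s / w_a of the fitted model (about -1.6 here),
# which drives the predicted next state toward zero
```

The random base policy in step 1 matters: once the learned policy makes $a$ a deterministic function of $s$, the new transitions alone would not identify $w_s$ and $w_a$ separately, so keeping the earlier data in $D$ is what keeps the fit well-posed.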

In step 2 above, we use supervised learning to fit the model, minimizing the least-squares error on the sampled transitions.

In step 3, we use the model to predict the next state given the current state and action, use the policy to choose the next action, and use the resulting states and actions to compute the cost. Finally, we backpropagate the cost through the model to train the policy.
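A minimal sketch of this backward pass, assuming (purely for illustration) a learned linear model $f(s, a) = w_s s + w_a a$, a linear policy $a = \theta s$, and a per-step cost $s^2$ over a horizon $H$; the reverse loop applies the chain rule through every model step:

```python
# Hypothetical learned model weights and policy parameter
w_s, w_a = 0.9, 0.4
theta = -1.0
s0, H = 1.0, 10

def grad_theta(theta):
    # Forward pass: roll the policy through the *learned* model
    s = [s0]
    for _ in range(H):
        a = theta * s[-1]               # policy action
        s.append(w_s * s[-1] + w_a * a) # model prediction
    # Reverse pass: accumulate dJ/dtheta, J = sum of s_t^2
    g, ds = 0.0, 0.0                    # ds holds dJ/ds_{t+1}
    for t in reversed(range(H)):
        ds += 2 * s[t + 1]              # cost gradient at step t+1
        g += ds * w_a * s[t]            # through a = theta * s_t
        ds *= w_s + w_a * theta         # through the model step
    return g

for _ in range(500):
    theta -= 0.2 * grad_theta(theta)

# theta converges to -w_s / w_a (-2.25 here), which zeroes the
# closed-loop multiplier w_s + w_a * theta under the learned model
```

In practice the forward and reverse passes are handled by automatic differentiation in the deep-learning framework; the sketch just makes the chain rule through the learned dynamics explicit.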

We continue to sample data and refit the model as the policy improves along the path.

Updated 2020-10-17

Tags

Data Science