1Cademy - Deep Q-learning

Learn Before

Value-based Methods for Deep Reinforcement Learning

Concept

Deep Q-learning

A value function frequently used in value-based methods is the action-value function Q(s,a), which represents the expected total value of taking action a in state s. It is the expected sum of the future rewards r, each discounted by a factor gamma. The optimal action-value function takes the maximum of this quantity over all policies π: $Q^{*}(s, a)=\max _{\pi} \mathbb{E}\left[r_{t}+\gamma r_{t+1}+\gamma^{2} r_{t+2}+\ldots \mid s_{t}=s, a_{t}=a, \pi\right]$
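As a small illustration of the discounted sum inside the expectation, the sketch below computes the return for a single, finite reward sequence. The function name `discounted_return` and the sample rewards are assumptions made for this example. The code does not perform the expectation or the maximization over policies that define Q*.

```python
def discounted_return(rewards, gamma):
    """Return r_t + gamma*r_{t+1} + gamma^2*r_{t+2} + ... for a finite reward list."""
    total = 0.0
    # Accumulate from the last reward backwards: G = r + gamma * G_next.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical reward sequence observed after taking action a in state s.
rewards = [1.0, 0.0, 2.0]
print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5*0 + 0.25*2 = 1.5
```

Averaging such returns over many trajectories that start with the same state and action would give an estimate of Q(s,a) for the policy that generated them.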

The basic steps of a deep Q-learning algorithm:

Train a convolutional neural network to extract the essential features that help the agent make a decision.
Calculate the Q-value of each possible action.
Perform back-propagation to update the network weights so that the predicted Q-values move closer to their targets (the observed reward plus the discounted maximum Q-value of the next state).
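The steps above can be sketched in a few lines. To stay dependency-free, this sketch swaps the convolutional network for a linear function approximator over hand-given features, so the "back-propagation" step reduces to a single semi-gradient update on the squared temporal-difference error. The names `q_values` and `td_update`, the learning rate, and the toy transition are all assumptions made for illustration, not part of the original material.

```python
def q_values(weights, features):
    """One Q-value per action: dot product of that action's weights with the features."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def td_update(weights, features, action, reward, next_features, done,
              gamma=0.99, lr=0.1):
    """One semi-gradient step on the squared TD error for the action taken."""
    # Target: observed reward plus the discounted best Q-value of the next state.
    target = reward
    if not done:
        target += gamma * max(q_values(weights, next_features))
    td_error = target - q_values(weights, features)[action]
    # For a linear model, the gradient of Q(s, a) w.r.t. its weights is the feature vector.
    weights[action] = [w + lr * td_error * x
                       for w, x in zip(weights[action], features)]
    return td_error


# Toy problem (assumed): 2 actions, 2 features, one terminal transition with reward 1.
weights = [[0.0, 0.0], [0.0, 0.0]]
state = [1.0, 0.0]
for _ in range(100):
    td_update(weights, state, action=0, reward=1.0,
              next_features=[0.0, 0.0], done=True)

learned_q = q_values(weights, state)
print(learned_q)  # Q for action 0 approaches 1.0; action 1 was never updated
```

In a full deep Q-learning setup, the feature vector would come from the convolutional layers and the update would be back-propagated through the whole network. The target computation stays the same.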

Updated 2020-10-22

Contributors are:

Ruobing Wang

Who are from:

University of Michigan - Ann Arbor

References

Slides from CMU: Introduction to Deep Reinforcement Learning
Complete Guide to Deep Reinforcement Learning: Concepts, Process, and Real World Applications

Learn After

Benefits of Deep Q-Learning
