Concept
Deep Q-learning
A value function frequently used in value-based methods is the action-value function Q(s,a), which represents the expected total value of taking action a in state s. It is the sum of the future rewards r, each weighted by a power of the discount factor gamma (between 0 and 1), so that rewards further in the future count less.
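As a rough illustration of the discounted sum, the sketch below computes the return for one fixed sequence of rewards. The function name `discounted_return` and the sample rewards are hypothetical, chosen only for this example. Q(s,a) would be the expectation of such a return over the trajectories that can follow (s,a).

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma * r_1 + gamma**2 * r_2 + ... for a reward sequence."""
    total = 0.0
    # Work backwards so each step multiplies the accumulated tail by gamma once.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical rewards for three consecutive steps, with gamma = 0.5:
# 1.0 + 0.5 * 0.0 + 0.25 * 2.0 = 1.5
example = discounted_return([1.0, 0.0, 2.0], gamma=0.5)
print(example)
```

With gamma close to 0 the agent values almost only the immediate reward; with gamma close to 1 distant rewards count nearly as much as immediate ones.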
The basic steps of a deep Q-learning algorithm are:
- Train a convolutional neural network to extract the essential features that help the agent make its decision.
- Calculate the Q-value of each possible action.
- Perform back-propagation to update the network weights so that the predicted Q-values become more accurate.
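The last two steps can be sketched in a few lines. This is a minimal sketch under stated assumptions, not a full deep Q-learning implementation: a linear function of hand-written state features stands in for the convolutional network, so the "back-propagation" reduces to a single gradient step, and the target is treated as a constant during that step. The names `q_values` and `td_update`, the learning rate, and the sample transition are all hypothetical.

```python
def q_values(weights, state):
    """Return one Q-value per action: dot product of that action's weights with the state features."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def td_update(weights, state, action, reward, next_state, done, gamma=0.9, lr=0.1):
    """Take one gradient step on the squared error for a single transition."""
    # Target: the reward plus the discounted best Q-value of the next state.
    target = reward
    if not done:
        target += gamma * max(q_values(weights, next_state))
    # Error between the current estimate for the chosen action and the target.
    error = q_values(weights, state)[action] - target
    # Gradient of 0.5 * error**2 w.r.t. the chosen action's weights is error * state.
    weights[action] = [w - lr * error * x for w, x in zip(weights[action], state)]
    return error


# Assumed toy setup: 2 actions, 3 state features, weights initialised to zero.
n_actions, n_features = 2, 3
weights = [[0.0] * n_features for _ in range(n_actions)]
state, next_state = [1.0, 0.0, 0.5], [0.0, 1.0, 0.5]

# Apply the same transition twice; the error should shrink in magnitude.
first_error = td_update(weights, state, action=1, reward=1.0, next_state=next_state, done=False)
second_error = td_update(weights, state, action=1, reward=1.0, next_state=next_state, done=False)
print(first_error, second_error)
```

Only the weights of the action that was actually taken are changed by each update. In a real deep Q-learning setup the same error would be back-propagated through all layers of the network.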
Updated 2020-10-22
Tags
Data Science