Concept

Exploration/Exploitation trade-off

In reinforcement learning, it is challenging to balance reusing actions that have already proved effective (exploitation) against trying new ones (exploration).

A simple way to address this problem is the $\epsilon$-greedy algorithm: with a small probability $\epsilon$ the agent selects an action at random (exploration), and with probability $1 - \epsilon$ it selects the action that has been most effective so far (exploitation).
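
The selection rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `epsilon_greedy` and the list `q_values` are hypothetical, and it assumes that "most effective so far" means the action with the highest current value estimate, with ties going to the lowest index.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index chosen by the epsilon-greedy rule.

    q_values: assumed list of current value estimates, one per action.
    epsilon:  probability of exploring, between 0 and 1.
    rng:      source of randomness (the random module or a random.Random).
    """
    if rng.random() < epsilon:
        # Exploration: pick uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploitation: pick the action with the highest current estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # illustrative values only
    print(epsilon_greedy(estimates, epsilon=0.1))
```

Because the random branch draws from all actions, the best-looking action can also be picked while exploring, so it is chosen slightly more often than $1 - \epsilon$ of the time.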

Updated 2021-07-08

Tags

Data Science

Learn After