1Cademy - Return

Learn Before

Math Behind Reinforcement Learning

Concept

Return

The ultimate goal is to choose actions over time so as to maximize the expected return. The return $G_t$ is the total discounted reward from time-step $t$: $G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$, where $\gamma \in [0, 1]$ is the discount factor. A value of $\gamma$ close to 0 leads to “myopic” evaluation, which weights immediate rewards most heavily; a value of $\gamma$ close to 1 leads to “far-sighted” evaluation, which gives distant rewards nearly as much weight as immediate ones.
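
To make the effect of the discount factor concrete, here is a minimal Python sketch that evaluates the return for a finite reward sequence. The function name `discounted_return` and the reward values are hypothetical, chosen only for illustration. The infinite sum is truncated by assuming the episode ends after the last listed reward, and the loop relies on the equivalent recursive form $G_t = R_{t+1} + \gamma G_{t+1}$.

```python
def discounted_return(rewards, gamma):
    """Compute G_t for a finite list of rewards [R_{t+1}, R_{t+2}, ...].

    Assumes the episode ends after the last reward, so all later
    rewards are treated as zero.
    """
    g = 0.0
    # Accumulate from the last reward backwards: G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical rewards: small immediate rewards, one large delayed reward.
rewards = [1.0, 1.0, 1.0, 10.0]

myopic = discounted_return(rewards, 0.0)       # only R_{t+1} counts
balanced = discounted_return(rewards, 0.5)     # later rewards shrink geometrically
far_sighted = discounted_return(rewards, 1.0)  # plain undiscounted sum

print(myopic, balanced, far_sighted)
```

With $\gamma = 0$ the large delayed reward is ignored entirely, while with $\gamma = 1$ it dominates the return, which is the "myopic" versus "far-sighted" contrast in numbers.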