Learn Before
Short Answer

Calculating Discounted Return

An autonomous agent receives the following sequence of rewards starting from the next time-step (t+1): +5, +2, +10. The episode ends after the third reward. If the agent uses a discount factor (γ) of 0.5, what is the total discounted return (G_t) from the current time-step (t)?

0

1

Updated 2025-10-02

Contributors are:

Who are from:

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science