1Cademy - An agent in a sequential decision-making process is at time step t and needs to select an action. The agent's goal is to choose actions that maximize the sum of all future rewards. Given that the agent has already received rewards for all actions taken up to this point, how should the quantity represented by the expression $\sum_{k=1}^{t-1} r_k$ be considered when determining the optimal action at the current time step t?

Learn Before

Sum of Past Rewards Notation

Multiple Choice

An agent in a sequential decision-making process is at time step 't' and needs to select an action. The agent's goal is to choose actions that maximize the sum of all future rewards. Given that the agent has already received rewards for all actions taken up to this point, how should the quantity represented by the expression $\sum_{k=1}^{t-1} r_k$ be considered when determining the optimal action at the current time step 't'?
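
One way to reason about the question is to note that $\sum_{k=1}^{t-1} r_k$ is already fixed by the time the agent acts at step t, so it adds the same constant to every candidate action's total. The minimal Python sketch below illustrates this under stated assumptions: the action names, the future-return estimates, and the helper `best_action` are all hypothetical and are not part of the original question.

```python
def best_action(future_return_by_action, past_reward_sum=0.0):
    """Pick the action with the largest (past rewards + estimated future return)."""
    totals = {
        action: past_reward_sum + future_return
        for action, future_return in future_return_by_action.items()
    }
    return max(totals, key=totals.get)


# Hypothetical estimates of the future return for each action available at step t.
future_returns = {"left": 1.5, "right": 4.0, "stay": 2.5}

# Hypothetical rewards r_1 .. r_{t-1}, already received before step t.
past_rewards = [3.0, -1.0, 2.0]

without_past = best_action(future_returns)
with_past = best_action(future_returns, sum(past_rewards))

print(without_past, with_past)  # right right
```

In this sketch the selected action is the same whether or not the past sum is included, because shifting every total by the same constant does not change which one is largest.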

Updated 2025-10-07
