Learn Before
Formula

Cumulative Future Reward (Return)

The cumulative future reward, often called the return, represents the total reward an agent accumulates from a specific time step tt until the final time step TT. It is a key metric in reinforcement learning for assessing the long-term value of actions or states. The return is calculated by summing the rewards from time step tt onwards, as shown by the formula: k=tTrk\sum_{k=t}^{T} r_k, where rkr_k represents the reward received at time step kk.

Image 0

0

1

Updated 2026-05-02

Contributors are:

Who are from:

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related