Formula

Total Reward (Return)

In reinforcement learning, the total reward, also known as the return, represents the cumulative sum of rewards an agent receives over a sequence of time steps, typically an episode. It is formally defined by the equation: $\sum_{t=1}^{T} r_t$, where $r_t$ is the reward at time step $t$, and $T$ is the final time step. Maximizing this cumulative reward is the primary objective for the agent.
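
As a minimal sketch of the definition above, the return of a single episode can be computed by summing its per-step rewards. The function name `total_return` and the sample reward values are hypothetical, chosen only for illustration; the sketch assumes a finite episode whose rewards are already collected in a list, with no discounting, as in the definition.

```python
from typing import Sequence


def total_return(rewards: Sequence[float]) -> float:
    """Sum the per-step rewards r_1, ..., r_T of one finite episode."""
    return float(sum(rewards))


# Hypothetical rewards for an episode with T = 4 time steps.
episode_rewards = [1.0, 0.0, -0.5, 2.0]
print(total_return(episode_rewards))  # 2.5
```

An agent that maximizes the return would prefer behavior whose episodes yield a larger value of this sum.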

Updated 2025-10-09

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences