Formula

Sum of Past Rewards Notation

The mathematical expression $\sum_{k=1}^{t-1} r_k$ represents the total of the rewards $r_k$ collected from the first time step ($k=1$) up to the time step just before the current one ($k=t-1$). At $t=1$ no rewards have been collected yet, so the sum is empty and equals 0.
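As a minimal sketch of how this sum could be computed, assume the rewards are stored in a Python list where `rewards[0]` holds the first reward (time step 1). The function name `sum_of_past_rewards` and the sample values are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def sum_of_past_rewards(rewards: Sequence[float], t: int) -> float:
    """Return r_1 + ... + r_{t-1} for a 1-indexed current time step t.

    Assumption: rewards[0] stores r_1, rewards[1] stores r_2, and so on.
    """
    if t < 1 or t > len(rewards) + 1:
        raise ValueError("t must satisfy 1 <= t <= len(rewards) + 1")
    # The slice [: t - 1] keeps exactly the first t-1 rewards.
    return sum(rewards[: t - 1])


# Hypothetical reward sequence r_1..r_4, for illustration only.
rewards = [1.0, 0.5, -2.0, 3.0]

past_at_t4 = sum_of_past_rewards(rewards, t=4)  # r_1 + r_2 + r_3
past_at_t1 = sum_of_past_rewards(rewards, t=1)  # empty sum
```

With these sample values, `past_at_t4` is `1.0 + 0.5 - 2.0 = -0.5`, and `past_at_t1` is `0` because no reward precedes the first time step. The reward at the current step, $r_t$, is deliberately excluded.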

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences