Notational Variations in State-Action Sequences (Trajectories)
A state-action sequence, or trajectory (τ), documents the path an agent takes through an environment. While the core concept is consistent, the notation used to represent these sequences can vary. For instance, a trajectory may be denoted as starting from time step 1, such as τ = {(s₁, a₁), (s₂, a₂), ...}, often to align with other notation in a specific context, like sequence prediction. Alternatively, it is common in reinforcement learning literature to see trajectories starting from time step 0, with varying lengths, such as τ = {(s₀, a₀), ..., (s_T, a_T)} or τ = {(s₀, a₀), ..., (s_{T−1}, a_{T−1})}. These notational differences are a matter of convention and do not alter the fundamental principles or models being discussed.
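The point that indexing is purely a labeling convention can be made concrete with a small sketch (state and action names here are illustrative placeholders, not from the text):

```python
# A trajectory is an ordered list of (state, action) pairs; whether the
# time index starts at 0 or 1 only changes the labels, not the pairs.
trajectory = [("s0", "a0"), ("s1", "a1"), ("s2", "a2")]

# 0-indexed convention: (s_0, a_0), ..., (s_T, a_T) with T = len - 1
zero_indexed = {t: pair for t, pair in enumerate(trajectory)}

# 1-indexed convention: (s_1, a_1), ..., (s_T, a_T) with T = len
one_indexed = {t + 1: pair for t, pair in enumerate(trajectory)}

# Same underlying sequence, shifted labels.
assert zero_indexed[0] == one_indexed[1]
```

Either dictionary describes the same path through the environment; only the keys differ.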
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Markov Process
Markov Decision Process
Return
Limitations of Reinforcement Learning
Policy
State-Value and Action-Value Functions
Notational Variations in State-Action Sequences (Trajectories)
Objective Function as Expected Cumulative Reward (Performance Function)
An agent operates in an environment where sequences of events unfold over time. The agent's behavior is described by a policy, denoted π(a|s), which gives the probability of taking action a when in state s. The environment's dynamics are described by a transition function, P(s′|s, a), which gives the probability of moving to the next state s′ after taking action a in state s. The process begins from an initial state s₀, drawn with probability P(s₀).
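The interaction loop described above can be sketched as a rollout in a tiny toy MDP. All names and probabilities here (states, actions, the deterministic dynamics) are illustrative assumptions, not from the text:

```python
import random

P_s0 = {"s0": 1.0}                           # initial-state distribution P(s0)
policy = {                                   # pi(a | s)
    "s0": {"a0": 1.0},
    "s1": {"a1": 1.0},
}
transition = {                               # P(s' | s, a)
    ("s0", "a0"): {"s1": 1.0},
    ("s1", "a1"): {"s2": 1.0},
}

def sample(dist, rng):
    """Draw one outcome from a {outcome: probability} dict."""
    r, acc = rng.random(), 0.0
    for outcome, p in dist.items():
        acc += p
        if r < acc:
            return outcome
    return outcome  # guard against floating-point round-off

def rollout(steps, rng):
    """Sample a trajectory tau = [(s_t, a_t), ...] of the given length."""
    s = sample(P_s0, rng)
    tau = []
    for _ in range(steps):
        a = sample(policy[s], rng)               # agent acts: a ~ pi(.|s)
        tau.append((s, a))
        s = sample(transition[(s, a)], rng)      # environment responds
    return tau, s

tau, final_state = rollout(2, random.Random(0))
# With these deterministic toy dynamics: tau is [("s0","a0"), ("s1","a1")]
# and final_state is "s2" - exactly the two-step sequence described below.
```

Because every distribution here puts probability 1 on a single outcome, the rollout always reproduces the s₀ → a₀ → s₁ → a₁ → s₂ sequence; with stochastic distributions, different trajectories would be sampled with the corresponding probabilities.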
Consider the following specific two-step sequence of events (a trajectory):
- The process starts in state s₀.
- The agent takes action a₀.
- The environment transitions to state s₁.
- The agent takes action a₁.
- The environment transitions to state s₂.
Which expression correctly represents the probability of this entire specific trajectory occurring?
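For reference, under the Markov assumption the probability of a specific trajectory factorizes into the initial-state probability times alternating policy and transition terms:

```latex
P(\tau) = P(s_0)\,\pi(a_0 \mid s_0)\,P(s_1 \mid s_0, a_0)\,\pi(a_1 \mid s_1)\,P(s_2 \mid s_1, a_1)
```

Each factor conditions only on the current state (and action), never on the full history.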
Diagnosing a Faulty Sequence Generation Process
When modeling the generation of a sequence of states and actions as a Markov Decision Process, the probability of transitioning to a new state at any given step depends on the complete history of all states and actions that have occurred since the beginning of the sequence.
Notational Variations in State-Action Sequences (Trajectories)
An agent is generating a sequence by interacting with an environment. For a single time step, starting from state s_t, arrange the following events in the correct logical order.
Learn After
Cumulative Reward of a Trajectory
An agent in an environment completes a sequence of two actions. It starts in an initial state s₀, performs action a₀ to reach state s₁, and then performs action a₁ to reach the final state s₂. Which of the following notations correctly represents the full sequence of state-action pairs, often called a trajectory (τ)?
Critiquing Trajectory Notations
An agent interacts with an environment for a total of T time steps, resulting in a sequence of states and actions. Match each mathematical notation for this sequence (trajectory, τ) to the description that accurately characterizes its structure and length.