Concept

Trajectory Generation as a Markov Decision Process

The process of generating a sequence, or trajectory (τ), can be formally modeled as a Markov Decision Process (MDP). This framework is essential for applying reinforcement learning to sequential tasks, as it defines the states, actions, and transition probabilities that govern the generation of trajectories under a given policy.
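The framing above can be illustrated with a minimal sketch. Everything named here is a hypothetical choice for illustration, not part of the source: the toy vocabulary `VOCAB`, the hand-written `policy`, and the assumption that the state is the prefix generated so far and the transition is deterministic (appending the chosen token). Under those assumptions, the log-probability of a trajectory τ is the sum of the policy's log-probabilities for each action taken.

```python
import math
import random

VOCAB = ["a", "b", "<eos>"]  # hypothetical toy vocabulary (the action space)


def policy(state):
    """Hypothetical policy pi(a | s): distribution over next tokens given the prefix."""
    # The chance of ending the sequence grows with the prefix length.
    p_eos = min(0.9, 0.2 + 0.2 * len(state))
    rest = (1.0 - p_eos) / 2
    return {"a": rest, "b": rest, "<eos>": p_eos}


def step(state, action):
    """Assumed deterministic transition: next state is the prefix plus the new token."""
    return state + (action,)


def sample_trajectory(rng, max_len=10):
    """Roll out the policy from the empty prefix; return (state, action) pairs and log-prob."""
    state = ()
    trajectory = []
    log_prob = 0.0
    for _ in range(max_len):
        probs = policy(state)
        action = rng.choices(list(probs), weights=list(probs.values()))[0]
        trajectory.append((state, action))
        log_prob += math.log(probs[action])
        state = step(state, action)
        if action == "<eos>":
            break
    return trajectory, log_prob


def trajectory_log_prob(actions):
    """Score a given action sequence: sum of log pi(a_t | s_t) along the trajectory."""
    state = ()
    total = 0.0
    for action in actions:
        total += math.log(policy(state)[action])
        state = step(state, action)
    return total


if __name__ == "__main__":
    traj, lp = sample_trajectory(random.Random(0))
    print([a for _, a in traj], lp)
```

Because the transition is deterministic in this sketch, the trajectory probability depends only on the policy; with stochastic transitions, each step would also contribute a transition log-probability term.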


Updated 2026-05-02


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences