Multiple Choice

An agent in an environment completes a sequence of two actions. It starts in an initial state s₀, performs action a₀ to reach state s₁, and then performs action a₁ to reach the final state s₂. Which of the following notations correctly represents the full sequence of state-action pairs, often called a trajectory (τ)?
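The scenario in the question can be sketched in a few lines of Python. This is only an illustration under assumptions: `collect_trajectory` and `toy_step` are hypothetical names, the transition rule is invented so that the states come out labelled s0, s1, s2, and whether the final state is written as part of τ varies between texts, so the sketch returns it separately.

```python
from typing import Callable, List, Tuple

State = str
Action = str


def collect_trajectory(
    s0: State,
    actions: List[Action],
    step: Callable[[State, Action], State],
) -> Tuple[List[Tuple[State, Action]], State]:
    """Record each (state, action) pair in order and return the final state."""
    pairs: List[Tuple[State, Action]] = []
    state = s0
    for action in actions:
        pairs.append((state, action))   # the pair (s_t, a_t)
        state = step(state, action)     # move on to s_{t+1}
    return pairs, state


def toy_step(state: State, action: Action) -> State:
    # Hypothetical deterministic transition: "s0" -> "s1" -> "s2", ignoring the action.
    return f"s{int(state[1:]) + 1}"


pairs, final_state = collect_trajectory("s0", ["a0", "a1"], toy_step)
print(pairs)        # the ordered state-action pairs
print(final_state)  # the state reached after the last action
```

Two actions yield two state-action pairs followed by one final state, which is the ordering the question asks the notation to capture.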

Updated 2025-09-26

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science