Learn Before
Calculating State Value in a Deterministic Environment
An agent operates in an environment with three states: A, B, and C. The agent starts in state A and follows a fixed, deterministic policy: from state A, it always moves to state B, and from state B, it always moves to state C. State C is a terminal state, ending the process.
The rewards for these transitions are as follows:
- The transition from A to B yields a reward of +2.
- The transition from B to C yields a reward of +10.
Using a discount factor (γ) of 0.9, calculate the value of the starting state A. Show your calculation based on the formula: V(S) = r₁ + γ·r₂ + γ²·r₃ + ..., the discounted sum of future rewards.
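The calculation asked for above can be sketched in a few lines of Python. This is an illustrative snippet, not part of the original question; the reward list and discount value are taken directly from the problem statement.

```python
# Discounted return for the deterministic trajectory A -> B -> C.
# rewards[0] is the A->B transition (+2), rewards[1] is B->C (+10);
# terminal state C contributes nothing further.
rewards = [2, 10]
gamma = 0.9

# V(A) = r1 + gamma * r2
value_A = sum(gamma**t * r for t, r in enumerate(rewards))
print(value_A)  # 2 + 0.9 * 10 = 11.0
```

The sum has only two terms because the episode ends at C, so no infinite series is needed here.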
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An agent is in a state 'S' and must choose between two policies, Policy A and Policy B. The sequence of rewards the agent will receive after starting in state 'S' and following each policy is deterministic and known:
- Policy A Reward Sequence: [+10, +1, +1, +1, ...]
- Policy B Reward Sequence: [+3, +3, +3, +3, ...]
Given the formula for the value of a state, V(S) = r₁ + γ·r₂ + γ²·r₃ + ..., which of the following statements correctly analyzes the relationship between the discount factor γ and the value of state 'S' for each policy?
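One way to build intuition for this related question is to approximate the two infinite discounted sums numerically. The sketch below is an illustrative assumption, not part of the card: it truncates each series at a long horizon and repeats the last reward, which matches both sequences' eventually-constant tails.

```python
# Approximate the infinite discounted return by truncating at a horizon
# and repeating the final (constant-tail) reward.
def value(rewards, gamma, horizon=1000):
    seq = rewards[:horizon] + [rewards[-1]] * max(0, horizon - len(rewards))
    return sum(gamma**t * r for t, r in enumerate(seq))

# Policy A: one large immediate reward, then +1 forever.
# Policy B: steady +3 forever.
for gamma in (0.5, 0.9):
    v_a = value([10, 1], gamma)
    v_b = value([3], gamma)
    print(gamma, round(v_a, 2), round(v_b, 2))
```

Running this shows the crossover the question is probing: a small γ favors Policy A's large immediate reward (e.g. γ = 0.5 gives V_A ≈ 11 vs V_B ≈ 6), while a large γ favors Policy B's steady stream (γ = 0.9 gives V_A ≈ 19 vs V_B ≈ 30).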
Calculating State Value in a Deterministic Environment
Advantage Function Formula
Temporal Difference (TD) Error as an Advantage Function Estimator
An agent is in a state 'S' and follows a fixed policy. From this state, the environment is stochastic: there is a 50% chance the agent will enter a trajectory with a reward sequence of [+10, 0, 0, ...] and a 50% chance it will enter a different trajectory with a reward sequence of [0, +10, 0, ...]. Given the state-value formula and a discount factor (γ) of 0.9, what is the value of state 'S'?
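For this stochastic variant, the value of 'S' is the probability-weighted expectation of the discounted return of each trajectory. The following sketch (illustrative, not part of the card) makes that expectation explicit:

```python
# V(S) = E[discounted return] over the two equally likely trajectories.
gamma = 0.9
trajectories = {
    (10, 0, 0): 0.5,  # reward sequence [+10, 0, 0, ...]
    (0, 10, 0): 0.5,  # reward sequence [0, +10, 0, ...]
}

def discounted_return(rewards, gamma):
    return sum(gamma**t * r for t, r in enumerate(rewards))

v_s = sum(p * discounted_return(r, gamma) for r, p in trajectories.items())
print(v_s)  # 0.5 * 10 + 0.5 * (0.9 * 10) = 9.5
```

The second trajectory's +10 arrives one step later, so it is discounted once (0.9 × 10 = 9) before the two outcomes are averaged.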