Learn Before
Concept

Markov Process

A Markov Process is a tuple $(S, P)$, where
• $S$ is a (finite) set of states
• $P$ is a state transition probability matrix, with $P_{ss'} = P[S_{t+1} = s' \mid S_t = s]$.

We impose a constraint on states. A sequence of states is Markov if and only if the probability of moving to the next state $S_{t+1}$ depends only on the present state $S_t$ and not on the previous states $S_1, S_2, \ldots, S_{t-1}$. That is, for all $t$,

$$P[S_{t+1} \mid S_t] = P[S_{t+1} \mid S_1, S_2, \ldots, S_t].$$

In reinforcement learning, the Markov Process is assumed to be time-homogeneous; that is, the transition probability is independent of $t$:

$$P[S_{t+1} = s' \mid S_t = s] = P[S_t = s' \mid S_{t-1} = s].$$
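The definition above can be sketched in code: a minimal simulation of a time-homogeneous Markov process, where the next state is sampled from a row of the transition matrix indexed only by the present state. The two-state weather chain (the `STATES` names and the matrix `P`) is an illustrative assumption, not part of the text.

```python
import random

# Illustrative two-state chain (assumed example, not from the text).
STATES = ["sunny", "rainy"]

# P[i][j] = probability of moving from state i to state j.
# Each row is a probability distribution and must sum to 1.
P = [
    [0.9, 0.1],  # transitions from "sunny"
    [0.5, 0.5],  # transitions from "rainy"
]

def step(state_idx, rng):
    """Sample the next state index given only the current one.

    The distribution depends solely on the present state (Markov
    property) and not on the time step t (time-homogeneity)."""
    r = rng.random()
    cumulative = 0.0
    for j, p in enumerate(P[state_idx]):
        cumulative += p
        if r < cumulative:
            return j
    return len(P[state_idx]) - 1  # guard against floating-point rounding

def simulate(start_idx, n_steps, seed=0):
    """Generate a state sequence of length n_steps + 1."""
    rng = random.Random(seed)
    path = [start_idx]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return [STATES[i] for i in path]
```

Because the chain is time-homogeneous, the same matrix `P` is reused at every step; nothing about the sampling rule changes with $t$.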


Updated 2025-08-31

Tags

Data Science

Learn After