Concept

Truncated Natural Policy Gradient

Truncated Natural Policy Gradient (TNPG) is an algorithm for computing the natural policy gradient direction. Rather than forming the full Fisher information matrix, it only needs to calculate the product $I(\theta)v$, where:

  1. $I(\theta)$ is the Fisher information matrix
  2. $v$ is an arbitrary vector.

TNPG improves on the natural policy gradient, which has a high computational cost because it requires computing and inverting the Fisher information matrix. TNPG avoids that step, so it is most useful when the parameter space is high-dimensional.
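
As a rough illustration of why a matrix-vector product is enough, the sketch below uses the conjugate gradient method to approximately solve $I(\theta)x = g$ for a direction $x$, touching the matrix only through a function that returns $I(\theta)v$. Everything named here is an assumption made for the example: `conjugate_gradient`, `fvp`, the gradient `g`, and the random positive-definite matrix `fisher` that stands in for a real Fisher information matrix. Stopping after a small, fixed number of iterations is what "truncated" is taken to mean in this sketch.

```python
import numpy as np


def conjugate_gradient(fvp, g, iters=10, tol=1e-10):
    """Approximately solve I(theta) x = g using only products fvp(v) = I(theta) v."""
    x = np.zeros_like(g)
    r = g.copy()      # residual g - I(theta) x, which equals g while x is zero
    p = r.copy()      # current search direction
    rs_old = r @ r
    for _ in range(iters):
        Ap = fvp(p)                     # the only place the matrix is used
        alpha = rs_old / (p @ Ap)       # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < tol:                # squared residual norm is small enough
            break
        p = r + (rs_new / rs_old) * p   # next conjugate direction
        rs_old = rs_new
    return x


# Hypothetical stand-in for the Fisher information matrix: a random
# symmetric positive-definite matrix (not derived from any real policy).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
fisher = A @ A.T / 50 + 0.1 * np.eye(50)
g = rng.standard_normal(50)  # hypothetical policy gradient


def fvp(v):
    # In practice this product would be computed without materialising the
    # matrix; here the explicit matrix is used only to keep the demo short.
    return fisher @ v


direction = conjugate_gradient(fvp, g, iters=50)
residual = np.linalg.norm(fisher @ direction - g)
```

With a smaller `iters` the returned `direction` is only an approximation of the exact solution, which is the trade-off between accuracy and computation that the method relies on.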

Updated 2020-10-27

Tags

Data Science