Truncated Natural Policy Gradient
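Truncated Natural Policy Gradient (TNPG) is an algorithm for approximately computing the natural gradient direction. Instead of forming or inverting the Fisher Information Matrix, it only needs to calculate the product Fv, where:
- F is the Fisher Information Matrix
- v is an arbitrary vector.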
TNPG improves on the natural policy gradient, whose high computational cost comes from inverting the Fisher Information Matrix. It is usually most beneficial when the parameter space is high-dimensional.
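As a rough illustration, the sketch below shows how a conjugate gradient solver can approximate the natural gradient direction (the solution x of Fx = g, where g is the ordinary policy gradient) using only products Fv. The names conjugate_gradient and fvp, the iteration limit, and the random matrix F standing in for the Fisher Information Matrix are assumptions made for this example. In a real policy the product Fv would typically be computed without building F explicitly.

```python
import numpy as np

def conjugate_gradient(fvp, g, iters=10, tol=1e-10):
    """Approximately solve F x = g using only Fisher-vector products fvp(v) = F v."""
    x = np.zeros_like(g)
    r = g.copy()          # residual g - F x (x starts at zero)
    p = r.copy()          # current search direction
    rs_old = r @ r
    for _ in range(iters):        # "truncated": stop after a fixed number of steps
        Fp = fvp(p)
        alpha = rs_old / (p @ Fp)
        x += alpha * p
        r -= alpha * Fp
        rs_new = r @ r
        if rs_new < tol:          # residual is already small enough
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Hypothetical stand-in for the Fisher Information Matrix:
# a random symmetric positive definite matrix (assumption for this demo only).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20))
F = A @ A.T / 20 + 0.1 * np.eye(20)
g = rng.standard_normal(20)       # stand-in for the policy gradient

# The solver only ever calls the product function, never inverts F.
direction = conjugate_gradient(lambda v: F @ v, g, iters=50)
```

Lowering iters gives a cheaper but rougher approximation of the direction, which is the trade-off the truncation refers to.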