Truncated Natural Policy Gradient
Truncated Natural Policy Gradient (TNPG) is an algorithm for computing the natural gradient direction. It uses the conjugate gradient method, so it only needs to calculate the matrix-vector product Fv, where:
- F is the Fisher Information Matrix
- v is an arbitrary vector.
TNPG improves on natural policy gradient, which has a high computational cost because it requires inverting the Fisher Information Matrix. By avoiding the explicit inverse, TNPG is most useful when the parameter space is high dimensional.
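The idea of relying only on products with the Fisher Information Matrix can be illustrated with a minimal conjugate gradient sketch. This is an assumption-laden toy, not a full TNPG implementation: the names `conjugate_gradient` and `fvp` are hypothetical, and a random symmetric positive-definite matrix stands in for the Fisher Information Matrix, which in practice would never be formed explicitly.

```python
import numpy as np

def conjugate_gradient(fvp, g, iters=10, tol=1e-12):
    """Approximately solve F x = g using only products F v.

    fvp: function returning F @ v for a vector v (hypothetical helper).
    g:   policy gradient vector (random stand-in in the example below).
    iters: maximum number of iterations (the "truncation").
    """
    x = np.zeros_like(g)
    r = g.copy()        # residual g - F x, with x = 0 initially
    p = r.copy()        # current search direction
    rs_old = r @ r
    for _ in range(iters):
        if rs_old < tol:            # residual already small enough
            break
        Fp = fvp(p)                 # the only place F is used
        alpha = rs_old / (p @ Fp)   # step size along p
        x += alpha * p
        r -= alpha * Fp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Toy data (assumption): a symmetric positive-definite matrix in place of F.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
F = A @ A.T + 50 * np.eye(50)
g = rng.standard_normal(50)

# Approximate natural gradient direction, i.e. the solution of F x = g.
direction = conjugate_gradient(lambda v: F @ v, g, iters=50)
```

In this sketch the matrix F is built only to keep the example short; the point of the approach is that `fvp` can be any routine that returns Fv without storing or inverting F.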