Concept

Background (Accelerating Human Learning With Deep Reinforcement Learning)

Here the authors describe three earlier probabilistic models of human memory:

Exponential Forgetting Curve - one of the oldest models of human memory. The authors use a parameterization where the user either forgets the information completely or retains it, and the probability of recall is given by:

$$Z \sim \text{Bernoulli}\left(\exp\left(-\theta \frac{D}{S}\right)\right)$$

"where θ ∈ R+ is the item difficulty, D ∈ R+ is the time elapsed since the item was last reviewed by the student, and S ∈ R+ is the student's memory strength for the item." S is set to the number of trials.
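The decay above is easy to see numerically. A minimal sketch (function names and the example numbers are my own, not from the paper):

```python
import math
import random

def recall_probability(theta: float, elapsed: float, strength: float) -> float:
    """Exponential forgetting curve: P(Z = 1) = exp(-theta * D / S).

    theta    -- item difficulty (> 0)
    elapsed  -- time D since the item was last reviewed
    strength -- memory strength S (here, the number of prior trials)
    """
    return math.exp(-theta * elapsed / strength)

def sample_recall(theta: float, elapsed: float, strength: float) -> int:
    """Draw Z ~ Bernoulli(exp(-theta * D / S)): 1 = recalled, 0 = forgotten."""
    return int(random.random() < recall_probability(theta, elapsed, strength))

# More trials (larger S) slow the decay, so recall stays likely for longer.
p_few = recall_probability(theta=0.5, elapsed=4.0, strength=1.0)   # exp(-2.0)
p_many = recall_probability(theta=0.5, elapsed=4.0, strength=8.0)  # exp(-0.25)
```

Note how S divides the elapsed time: doubling the number of trials has the same effect on recall probability as halving the delay since the last review.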

Half-Life Regression - unlike the exponential forgetting curve, there is no explicit difficulty parameter here; instead we have a feature vector $\vec{x} \in X$ that encodes the student's study history, and model parameters $\vec{\theta} \in \Theta$. The memory strength is then

$$S = \exp(\vec{\theta} \cdot \vec{x})$$

To encode the number of attempts, the correct/incorrect answers, and the item identity, the authors set $X = \mathbb{N}^3 \times \{0, 1\}^n$. Dropping the item-difficulty parameter loses no information, as the difficulty "is absorbed into the memory strength via the coefficients of the item identity indicator features."
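A sketch of the feature encoding and strength computation. The concrete numbers and the use of exponential decay for the recall probability are illustrative assumptions, not values from the paper:

```python
import numpy as np

def memory_strength(theta: np.ndarray, x: np.ndarray) -> float:
    """Half-Life Regression: S = exp(theta . x).

    x packs the study history: (attempts, correct, incorrect) as counts,
    followed by a one-hot item-identity indicator of length n, so that
    x lies in N^3 x {0,1}^n. Item difficulty is absorbed into the
    coefficients theta carries for the identity features.
    """
    return float(np.exp(theta @ x))

# Hypothetical weights: 3 count features + 4 item-identity features.
theta = np.array([0.1, 0.3, -0.4, 0.2, -0.1, 0.0, 0.05])
x = np.array([5, 4, 1, 0, 1, 0, 0])  # 5 attempts, 4 correct, 1 wrong, item #2
S = memory_strength(theta, x)

# Recall probability then decays with elapsed time D, e.g. exp(-D / S)
# under the exponential-decay assumption from the previous model.
p = np.exp(-2.0 / S)  # recall probability after D = 2 time units
```

Because incorrect answers get a negative coefficient here, a worse history shrinks S and makes recall decay faster, which is the intended behavior of the model.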

Generalized Power Law (the paper describes this part concisely; it could not be broken down into smaller parts) - here the recall likelihood takes a different form:

In the last formula from the picture, a and d are the parameters for student ability and item difficulty, respectively.

[Image: recall likelihood formulas for the Generalized Power Law]


Updated 2020-10-26

Tags

Data Science

Learn After