Concept

Experimental Setup (Using deep reinforcement learning for personalizing review sessions on e-learning platforms with spaced repetition)

  1. Number of items: 30
  2. Number of runs: 10
  3. Number of episodes per run: 100
  4. Number of steps per episode: 200
  5. Delay between steps: 5s
  6. Four baseline policies: Leitner, SuperMemo, Random, Threshold
  7. EFC: item difficulty $\theta$ sampled from a log-normal distribution, $\log\theta \sim \mathcal{N}(\log 0.077,\ 1)$
  8. HLR: weights $\vec{\theta} = (1,\ 1,\ 0,\ \theta_3)$ with $\theta_3 \sim \mathcal{N}(0, 1)$; features $x_i$ = (num attempts, num correct, num incorrect, one-hot encoding of item $i$ out of $n$ items)
  9. GPL: student abilities $a = \vec{\alpha} = 0$; item difficulties sampled as $d \sim \mathcal{N}(1, 1)$ and $\log\vec{d} \sim \mathcal{N}(1, 1)$; delay coefficient $\log r \sim \mathcal{N}(0, 0.001)$; window coefficients $\theta_{2w} = \theta_{2w-1} = \frac{1}{\sqrt{W - w + 1}}$; number of windows $W = 5$
  10. TRPO and TNPG used the same settings
  11. LSTM: 20 units, a dense unit with sigmoid activation, Adam optimizer, 2 hidden states
  12. DRL
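The student memory models in items 7–8 can be sketched as follows. This is a minimal illustration assuming the standard exponential forgetting curve (EFC) form $p = e^{-\theta\,\text{delay}}$ and the standard half-life regression (HLR) form $p = 2^{-\text{delay}/h}$ with $h = 2^{\vec\theta \cdot x}$, with parameters sampled as listed above; it is not the authors' exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items = 30  # item 1 above

# EFC: item difficulties theta sampled log-normally (item 7):
# log(theta) ~ N(log 0.077, 1)
theta = np.exp(rng.normal(np.log(0.077), 1.0, size=n_items))

def efc_recall_prob(item, delay):
    # Assumed EFC form: recall decays exponentially with delay,
    # scaled by the item's difficulty.
    return np.exp(-theta[item] * delay)

# HLR weights (item 8): theta = (1, 1, 0, theta_3) with the one-hot
# block theta_3 ~ N(0, 1), one weight per item.
hlr_theta = np.concatenate(([1.0, 1.0, 0.0], rng.normal(0.0, 1.0, n_items)))

def hlr_recall_prob(item, delay, attempts, correct, incorrect):
    # Feature vector x_i = (num attempts, num correct, num incorrect,
    # one-hot encoding of item i out of n items), as in item 8.
    x = np.zeros(3 + n_items)
    x[:3] = attempts, correct, incorrect
    x[3 + item] = 1.0
    half_life = 2.0 ** (hlr_theta @ x)   # assumed HLR half-life form
    return 2.0 ** (-delay / half_life)
```

Both models map an item's review history and the elapsed delay to a recall probability in $(0, 1]$, which is what the scheduling policies act on.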
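The episode structure (items 2–6) can likewise be sketched with the two simplest baselines. The threshold policy below is a generic version (review the item furthest below a recall threshold) and the forgetting dynamics are a toy stand-in, both assumptions for illustration rather than the paper's exact baselines.

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, n_steps = 30, 200  # items 1 and 4 above

def random_policy(recall_probs):
    # Random baseline: review a uniformly chosen item.
    return int(rng.integers(len(recall_probs)))

def threshold_policy(recall_probs, tau=0.5):
    # Generic threshold baseline (assumed form): review the item
    # whose predicted recall has dropped furthest below tau.
    below = np.where(recall_probs < tau)[0]
    if below.size:
        return int(below[np.argmin(recall_probs[below])])
    return int(np.argmin(recall_probs))

# Toy episode: recall probabilities decay between steps (forgetting);
# reviewing an item restores its recall probability.
probs = np.ones(n_items)
decay = rng.uniform(0.9, 0.99, size=n_items)
for _ in range(n_steps):
    probs *= decay                 # forgetting between review steps
    item = threshold_policy(probs)
    probs[item] = 1.0              # review resets the item's recall
score = probs.mean()               # mean predicted recall at episode end
```

Averaging this end-of-episode score over the 100 episodes and 10 runs listed above is one way to compare the baseline policies against the DRL scheduler.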

Updated 2020-10-29

Tags

Data Science
