1Cademy - Observed Relation Between Rewards and Thresholds (Using deep reinforcement learning for personalizing review sessions on e-learning platforms with spaced repetition)


The researchers examined which methods produced the highest rewards. They varied the threshold from 0 to 1 and recorded the resulting reward values to observe how the reward changed.

For the EFC student model with a likelihood-based reward function, the highest reward values occurred when the threshold was between 0.2 and 0.4. This suggests that student learning is most efficient when the probability of answering correctly is between 0.2 and 0.4. With a log-likelihood-based reward function, reward values decreased as the threshold increased.

For the HLR student model, the plots of reward against threshold resembled a step function, with the best reward values occurring when the threshold was between 0 and 0.5.

For the GPL (DASH) model with a likelihood-based reward function, the reward values became noticeably noisy as the threshold increased. With the log-likelihood-based reward function, however, reward values decreased as the threshold increased, similar to the trend observed for the EFC student model.
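To make the threshold sweep concrete, the following is a minimal sketch of how such an experiment could be set up. It is not the researchers' code. Everything in it is an assumption made for illustration: the toy exponential forgetting curve `exp(-difficulty * elapsed / strength)`, the rule that the tutor reviews the item whose recall probability is closest to the threshold, the strength update of +1 per review, and the hypothetical names `run_episode` and `sweep_thresholds`. The two rewards are taken here to be the mean recall probability (likelihood) and the mean log recall probability (log-likelihood) over all items.

```python
import math
import random


def run_episode(threshold, n_items=20, n_steps=200, seed=0):
    """Simulate one review session with a toy forgetting-curve student.

    All modelling choices here are illustrative assumptions, not the
    setup used in the study.
    """
    rng = random.Random(seed)
    # Assumed per-item difficulty; higher means faster forgetting.
    difficulty = [rng.uniform(0.02, 0.2) for _ in range(n_items)]
    strength = [1.0] * n_items   # assumed memory strength per item
    last_seen = [0] * n_items    # step at which each item was last reviewed
    lik_sum = 0.0
    loglik_sum = 0.0

    for step in range(1, n_steps + 1):
        # Log recall probability under the assumed exponential forgetting curve.
        log_probs = [
            -difficulty[i] * (step - last_seen[i]) / strength[i]
            for i in range(n_items)
        ]
        probs = [math.exp(lp) for lp in log_probs]

        # Rewards for this step: mean likelihood and mean log-likelihood.
        lik_sum += sum(probs) / n_items
        loglik_sum += sum(log_probs) / n_items

        # Assumed threshold policy: review the item whose recall
        # probability is closest to the threshold.
        item = min(range(n_items), key=lambda i: abs(probs[i] - threshold))
        strength[item] += 1.0    # assumed strengthening after each review
        last_seen[item] = step

    return lik_sum / n_steps, loglik_sum / n_steps


def sweep_thresholds(n_points=11, **kwargs):
    """Run one episode per threshold value, evenly spaced from 0 to 1."""
    results = []
    for k in range(n_points):
        threshold = k / (n_points - 1)
        lik, loglik = run_episode(threshold, **kwargs)
        results.append((threshold, lik, loglik))
    return results


if __name__ == "__main__":
    for threshold, lik, loglik in sweep_thresholds():
        print(f"threshold={threshold:.1f}  "
              f"likelihood={lik:.3f}  log-likelihood={loglik:.3f}")
```

Plotting the two reward columns against the threshold gives curves of the kind discussed above. The shapes produced by this toy model need not match the reported results, since the student models and parameters in the study differ from these assumptions.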

Updated 2026-07-06

Contributor: Nineli Lashkarashvili (San Diego State University)
