Concept

Research Question and Objectives (Using deep reinforcement learning for personalizing review sessions on e-learning platforms with spaced repetition)

How does deep reinforcement learning (DRL) perform compared with baseline policies when the reward is changed from the exact probability of recalling each item to a realistically observable reward, such as the average of correct outcomes in a sample of exercises?
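
To make the contrast between the two reward definitions concrete, here is a minimal sketch in Python. The function names, the recall probabilities, and the sample size are hypothetical and chosen only for illustration; the project does not specify them. The first function assumes access to the learner's true recall probabilities, which a real platform cannot observe, while the second uses only the answers to a random sample of exercises.

```python
import random


def exact_reward(recall_probs):
    """Mean true recall probability over all items (not observable in practice)."""
    return sum(recall_probs) / len(recall_probs)


def observed_reward(recall_probs, sample_size, rng):
    """Fraction of correct answers on a random sample of exercises.

    Each sampled item is answered correctly with its recall probability;
    only the 0/1 outcomes are used, as a platform would observe them.
    """
    sampled = rng.sample(range(len(recall_probs)), sample_size)
    outcomes = [1 if rng.random() < recall_probs[i] else 0 for i in sampled]
    return sum(outcomes) / sample_size


# Hypothetical learner state: one recall probability per item.
recall_probs = [0.9, 0.4, 0.7, 0.2, 0.6]
rng = random.Random(0)  # seeded so the simulation is repeatable

print("exact reward:   ", exact_reward(recall_probs))
print("observed reward:", observed_reward(recall_probs, sample_size=3, rng=rng))
```

The observed reward is a noisy estimate of the exact one, and the noise grows as the sample gets smaller, which is one reason the agent's performance under the two rewards may differ.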

What is the effect of replacing that reward function with a recurrent neural network (RNN) model that predicts the reward (reward shaping)? We used a Long Short-Term Memory (LSTM) network, a kind of RNN that has been shown to achieve good results on sequential prediction tasks.
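
As a rough illustration of how an LSTM can turn a review history into a predicted reward, the sketch below implements the forward pass of a single LSTM layer in plain Python. It is not the project's model: the class name, the two input features per review (a correctness flag and a normalized elapsed time), the hidden size, and the random untrained weights are all assumptions, and training is omitted entirely.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMRewardModel:
    """Untrained single-layer LSTM mapping a review history to a value in (0, 1)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        n = input_size + hidden_size  # each gate sees [x_t, h_{t-1}]

        def matrix():
            return [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                    for _ in range(hidden_size)]

        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell state.
        self.W = {gate: matrix() for gate in "ifog"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_size)]
        self.hidden_size = hidden_size

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.W[name], self.b[name])]

    def predict(self, sequence):
        """Run the LSTM over the sequence and squash the last hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            o = self._gate("o", z, sigmoid)
            g = self._gate("g", z, math.tanh)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return sigmoid(sum(w * v for w, v in zip(self.w_out, h)))


# Hypothetical history: (answered correctly?, normalized time since last review).
history = [(1.0, 0.1), (0.0, 0.5), (1.0, 0.3)]
model = TinyLSTMRewardModel(input_size=2, hidden_size=4)
print("predicted reward:", model.predict(history))
```

In practice the weights would be fitted to logged outcomes, typically with a deep learning library, and the prediction would then stand in for the reward signal given to the DRL agent.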

Updated 2020-11-07

Tags

Data Science