Concept

Prediction Result Analysis (Does Time Matter? Modeling the Effect of Time in Bayesian Knowledge Tracing)

  • The prediction performance of the three models was measured by the residuals and AUC values between predictions and actual responses, computed separately for same-day events, new-day events, and all events in each problem set. The model with the higher AUC for a problem set was deemed the more accurate predictor of that problem set. In addition, two-tailed paired t-tests were run between KT and KT-Forget and between KT and KT-Slip.

  • They first applied this analysis to the datasets collected from Cognitive Tutor. Overall, the new KT-Forget model performed better than the regular KT model on both residuals and AUC, whereas the KT-Slip model performed worse than expected. For KT-Forget, the AUC improvement on new-day events was not significant (p = 0.5175), but the improvements on same-day events and on overall prediction were significant (p = 0.0178 and p = 0.0129, respectively), which means KT-Forget predicts the Cognitive Tutor data more accurately than regular KT. The better prediction performance also supports the hypothesis that students probably forget knowledge when a new day begins. For KT-Slip, the overall AUC was worse than that of regular KT, though not significantly so. However, both the same-day and new-day AUC were significantly worse, which contradicted the authors' assumption that students may slip when a new day begins. They then applied the same models to the ASSISTments datasets.

  • On the ASSISTments data, both new models, KT-Forget and KT-Slip, lost to the regular KT model, especially on AUC. They looked into why the new models performed so much worse and found that the way the data was collected led to this result. As they mentioned in the previous section, students are forced to leave the tutor after finishing a certain number of questions in one day and come back to it on a new day. As a result, the ASSISTments datasets have far fewer new-day events (on average 1 per student) and are not as amenable to a time analysis as the Cognitive Tutor data, which has many new days per student and in which students experience the new day more naturally. Therefore, the results obtained from Cognitive Tutor are more practical for this analysis.
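
The evaluation described above (AUC and residuals per problem set, then a paired t-test across problem sets) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the residual is assumed to be the mean of actual minus predicted (the note does not define it), and the numbers in the usage lines are made-up toy values.

```python
from math import sqrt
from statistics import mean, stdev


def auc(labels, scores):
    """AUC as the chance that a correct response (1) gets a higher
    predicted probability than an incorrect one (0); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both correct and incorrect responses")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_residual(labels, scores):
    """Assumed residual definition: average of (actual - predicted)."""
    return mean(y - s for y, s in zip(labels, scores))


def paired_t(metric_a, metric_b):
    """Paired t statistic and degrees of freedom for two lists of
    per-problem-set metrics (e.g. AUC of KT-Forget vs. AUC of KT)."""
    diffs = [a - b for a, b in zip(metric_a, metric_b)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Toy usage: one problem set, then a comparison over three problem sets.
actual = [1, 0, 1, 1, 0]
predicted = [0.9, 0.4, 0.6, 0.3, 0.5]
print(auc(actual, predicted), mean_residual(actual, predicted))

auc_forget = [0.70, 0.65, 0.80]  # hypothetical per-problem-set AUCs
auc_kt = [0.60, 0.60, 0.70]
print(paired_t(auc_forget, auc_kt))
```

To split the analysis into same-day, new-day, and overall events, the same `auc` call would be applied to the corresponding subsets of responses. The two-tailed p-value follows from the t statistic and its degrees of freedom via the Student t distribution; a statistics library routine such as `scipy.stats.ttest_rel` computes both in one step.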

Updated 2021-01-23

Tags

Data Science