Concept

Implementation (Deep-IRT: Make Deep Learning Based Knowledge Tracing Explainable Using Item Response Theory)

  1. Experimental Setting

Inputs $(q_t, a_t)$ are fed using their ID tags: $ID(q_t) \in \{1, \dots, Q\}$ and $ID(q_t, a_t) = q_t + a_t Q \in \{1, \dots, 2Q\}$. All the parameters (the KC matrix $A$, $M^k$, $M^v$, $W$, $b$) are initialized randomly from a Gaussian distribution and learned during training by minimizing the cross-entropy loss:

$$L = - \sum_t \left( a_t \log p_t + (1 - a_t) \log (1 - p_t) \right)$$

The Adam optimizer was used with a learning rate of 0.003 and a batch size of 32. Sequences have a fixed length of 200 timesteps, except for the Synthetic dataset.
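The ID encoding and the loss above can be sketched in a few lines. This is a minimal numpy illustration, not the paper's implementation; `Q` is a hypothetical question count and the function names are made up for this note.

```python
import numpy as np

Q = 5  # hypothetical number of distinct questions; dataset-specific in practice

def interaction_id(q_t, a_t):
    """Combined (question, answer) index: ID(q_t, a_t) = q_t + a_t * Q in {1, ..., 2Q}."""
    return q_t + a_t * Q

def cross_entropy_loss(p, a):
    """L = -sum_t [ a_t log p_t + (1 - a_t) log(1 - p_t) ] over a sequence."""
    p = np.asarray(p, dtype=float)
    a = np.asarray(a, dtype=float)
    return float(-np.sum(a * np.log(p) + (1.0 - a) * np.log(1.0 - p)))
```

So a wrong answer to question 3 maps to index 3, while a correct answer maps to index 3 + Q = 8, letting one embedding table cover both outcomes.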

  2. Hyperparameter Selection and Evaluation

30% of each dataset is held out as the test set and the rest is used for training. 5-fold cross-validation on the training set is used for hyperparameter selection: the hidden layer size and the dimensions of the key and value matrices are each chosen from {10, 50, 100, 200}. Model performance is measured by the area under the ROC curve (AUC).


Updated 2020-11-17

Tags

Data Science