Concept

Working Mechanism of DKVMN (Deep-IRT: Make Deep Learning Based Knowledge Tracing Explainable Using Item Response Theory)

"DKVMN model works as follows: at time t, it first receives a KC qtq_t, then predicts the probability of answering qtq_t correctly, and eventually updates the memory using the question-and-answer interaction (qtq_t, ata_t)." We can think that there are for Q different knowledge components (KCs) we have N latent concepts. These latent concepts are in key memory - MkRN×dkM^k \in \R ^{\N \times d_k}. Here dkd_k denotes the embedding size of key memory slot. Knowledge states are stored in value memory: MvRN×dvM^v \in R^{N \times d_v}. Here dvd_v also denotes the embedding size but in this case of value memory slot. DKVMN has three major steps:

  1. Getting the attention weight. First, the embedding $k_t$ of $q_t$ is looked up in the KC embedding matrix; $k_t$ is then matched against each key in the key memory matrix, giving a weight that measures how much attention should be paid to each slot of the value memory matrix: $w_{ti} = \mathrm{Softmax}(M_i^k k_t)$, with $\sum_{i=1}^{N} w_{ti} = 1$, where $M_i^k$ is the $i$-th row vector of $M^k$ and $w_{ti}$ is the $i$-th element of the weight vector.

  2. Making the prediction. First we read the latent knowledge state from the value memory $M_t^v$ to create the read vector $r_t = \sum_{i=1}^{N} w_{ti} (M_{ti}^v)^T$. The read vector and the KC embedding $k_t$ are then concatenated and used to generate a feature vector, which in turn yields the probability of the student answering knowledge component $q_t$ correctly: $f_t = \tanh(W_f [r_t, k_t] + b_f)$ and $p_t = P(\alpha_t) = \sigma(W_p f_t + b_p)$. Both of these functions are applied element-wise; the $W$'s and $b$'s are weight matrices and bias vectors.

  3. Updating the value memory. First we retrieve the embedding vector $v_t$ of the interaction $(q_t, \alpha_t)$ from the interaction embedding matrix (a matrix separate from the KC embedding one, since it encodes the question together with its answer). This embedding vector represents the knowledge growth after working on $q_t$ with the correctness label $\alpha_t$. In the update operation, part of the memory is erased before the new information is added:

$e_t = \sigma(W_e v_t + b_e)$
$a_t = \tanh(W_a v_t + b_a)$
$\tilde{M}_{t+1,i}^v = M_{ti}^v \otimes (1 - w_{ti} e_t)^T$
$M_{t+1,i}^v = \tilde{M}_{t+1,i}^v + w_{ti} a_t^T$

where $e_t$ is the erase vector and $a_t$ is the add vector (not to be confused with the answer label $\alpha_t$).
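
Putting the three steps together, here is a minimal NumPy sketch of one DKVMN time step. All sizes, the embedding matrices `A` (KCs) and `B` (interactions), the `(q, alpha)` indexing convention, and the `W`/`b` parameters are hypothetical stand-ins: in the real model they are learned end to end, whereas here they are random.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, N, d_k, d_v = 100, 20, 50, 200        # hypothetical sizes

A = rng.normal(size=(Q, d_k))            # KC embedding matrix: q_t -> k_t
B = rng.normal(size=(2 * Q, d_v))        # interaction embedding: (q_t, alpha_t) -> v_t
M_k = rng.normal(size=(N, d_k))          # key memory (static)
M_v = rng.normal(size=(N, d_v))          # value memory (updated every step)

# Hypothetical trained parameters (random stand-ins)
W_f, b_f = rng.normal(size=(d_k, d_v + d_k)), np.zeros(d_k)
W_p, b_p = rng.normal(size=(1, d_k)), np.zeros(1)
W_e, b_e = rng.normal(size=(d_v, d_v)), np.zeros(d_v)
W_a, b_a = rng.normal(size=(d_v, d_v)), np.zeros(d_v)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

q_t, alpha_t = 7, 1                      # KC id and correctness label

# Step 1: attention weights over the N latent concepts
k_t = A[q_t]
w_t = softmax(M_k @ k_t)                 # w_ti = Softmax(M_i^k k_t)

# Step 2: read the knowledge state and predict
r_t = w_t @ M_v                          # r_t = sum_i w_ti M_ti^v
f_t = np.tanh(W_f @ np.concatenate([r_t, k_t]) + b_f)
p_t = sigmoid(W_p @ f_t + b_p)           # probability of answering q_t correctly

# Step 3: erase-then-add update of the value memory
v_t = B[q_t + alpha_t * Q]               # one common (q, alpha) indexing convention
e_t = sigmoid(W_e @ v_t + b_e)           # erase vector
a_t = np.tanh(W_a @ v_t + b_a)           # add vector
M_v = M_v * (1.0 - np.outer(w_t, e_t)) + np.outer(w_t, a_t)

print(f"p_t = {p_t[0]:.3f}")
```

During training, the prediction $p_t$ is compared against the observed label $\alpha_t$ (typically with a cross-entropy loss) and all embeddings, memories, and weights are updated by backpropagation.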


Updated 2020-11-17

Tags

Data Science