Concept

Question Encoder and Knowledge Encoder (Context-Aware Attentive Knowledge Tracing)

The question encoder takes raw question embeddings $\{x_1, \ldots, x_t\}$ as input and outputs a sequence of context-aware question embeddings $\{\hat{x}_1, \ldots, \hat{x}_t\}$ using a monotonic attention mechanism. The knowledge encoder takes raw question-response embeddings $\{y_1, \ldots, y_{t-1}\}$ as input and outputs a sequence of actual knowledge acquired $\{\hat{y}_1, \ldots, \hat{y}_{t-1}\}$ using the same monotonic attention mechanism.

We use a modified, monotonic version of the scaled dot-product attention mechanism for the encoders. A multiplicative exponential decay term is added to the attention scores: $\alpha_{t,\tau} = \frac{\exp(s_{t,\tau})}{\sum_{\tau'} \exp(s_{t,\tau'})}$ with $s_{t,\tau} = \frac{\exp(-\theta \cdot d(t,\tau)) \cdot q_t^\intercal k_\tau}{\sqrt{D_k}}$, where $\theta > 0$ is a learnable decay rate parameter and $d(t,\tau)$ is a context-aware distance measure between time steps $t$ and $\tau$. The context-aware distance measure uses another softmax function to adjust the distance between consecutive time indices according to how the concept practiced in the past is related to the current concept.
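To make the mechanism concrete, here is a minimal single-head NumPy sketch. The note does not spell out $d(t,\tau)$, so the sketch assumes the form from the AKT paper, $d(t,\tau) = |t-\tau| \cdot \sum_{t'=\tau+1}^{t} \gamma_{t,t'}$, where $\gamma_{t,t'}$ is the softmax-normalized similarity of the current query to past keys; the function name `monotonic_attention` and the fixed `theta` are illustrative, and in the model $\theta$ would be learned.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def monotonic_attention(Q, K, V, theta=0.5):
    """Monotonic attention over past steps tau <= t (single head, no batch).

    Q, K, V: (T, D) arrays of queries, keys, and values.
    theta:   decay rate; a learnable scalar > 0 in the model, fixed here.
    Returns the (T, D) outputs and the per-step attention weight vectors.
    """
    T, D = Q.shape
    out = np.zeros_like(V)
    weights = []
    for t in range(T):
        # raw scaled dot-product scores against steps tau = 0..t
        raw = Q[t] @ K[: t + 1].T / np.sqrt(D)
        # gamma: softmax similarity of the current query to each past step;
        # related concepts shrink the effective time gap (context awareness)
        gamma = softmax(raw)
        # d(t, tau) = |t - tau| * sum_{t' = tau+1}^{t} gamma_{t, t'}
        tail = np.concatenate([np.cumsum(gamma[::-1])[::-1], [0.0]])
        d = np.array([(t - tau) * tail[tau + 1] for tau in range(t + 1)])
        # decayed score s_{t,tau} = exp(-theta * d(t,tau)) * q_t.k_tau / sqrt(D_k)
        s = np.exp(-theta * d) * raw
        alpha = softmax(s)              # attention weights over tau <= t
        out[t] = alpha @ V[: t + 1]
        weights.append(alpha)
    return out, weights
```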

In summary, the monotonic attention mechanism takes the basic form of an exponential decay curve over time with possible spikes at time steps when the past question is highly similar to the current question.
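Continuing the sketch above, a toy run illustrates this shape: planting one past key equal to the current query produces a spike at that step on top of the usual recency decay (the dimensions and seed here are arbitrary).

```python
rng = np.random.default_rng(0)
T, D = 8, 16
Q, K, V = (rng.normal(size=(T, D)) for _ in range(3))
K[2] = Q[-1]  # pretend step 2 practiced the same concept as the current question
_, weights = monotonic_attention(Q, K, V, theta=0.5)
print(np.round(weights[-1], 3))
# most weight lands on index 2 (the related past step); the rest favors recent steps
```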


Updated 2021-01-16

Tags

Data Science