Theory

Model Description (Deep Attentive Study Session Dropout Prediction in Mobile Learning Environment)

The proposed model has a structure similar to the Transformer, consisting of $n$ stacked encoders and decoders. Each encoder consists of a self-attention layer and a feedforward layer, each followed by a residual connection and layer normalization. The decoders contain the same layers, plus an additional fully connected layer for prediction. The encoder and decoders take distinct inputs: the encoder receives the $e_i$s (question metadata features), while the decoders receive the response features shifted by one, $(S, l_1, \cdots, l_{n-1})$, together with the output of the encoder. The authors use MultiHeadAttention with upper triangular masking.
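The shifted decoder input and the upper triangular mask can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names are hypothetical, the labels are assumed to be simple 0/1 values, and `S` is assumed to act as a start token. The mask marks every entry above the diagonal, so position $i$ can only attend to positions $j \le i$.

```python
import math


def shift_right(labels, start_token="S"):
    # (l_1, ..., l_n) -> (S, l_1, ..., l_{n-1}).
    # Treating "S" as a start token is an assumption made for this sketch.
    return [start_token] + list(labels[:-1])


def upper_triangular_mask(n):
    # mask[i][j] is True when j > i, meaning position i may NOT attend to j.
    return [[j > i for j in range(n)] for i in range(n)]


def masked_attention_weights(scores, mask):
    # Row-wise softmax over raw attention scores, with masked entries
    # set to -inf so that they receive exactly zero weight.
    weights = []
    for row, mask_row in zip(scores, mask):
        masked = [float("-inf") if m else s for s, m in zip(row, mask_row)]
        peak = max(masked)  # the diagonal is never masked, so peak is finite
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical response labels for a sequence of length 4.
decoder_input = shift_right([1, 0, 1, 1])

# Uniform scores make the effect of the mask easy to read off.
mask = upper_triangular_mask(4)
weights = masked_attention_weights([[0.0] * 4 for _ in range(4)], mask)
```

With uniform scores, each row of `weights` spreads its attention evenly over the current and earlier positions only, which is the behavior the masking is meant to enforce. In a real model, a library attention layer would consume the mask rather than a hand-written softmax.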

Updated 2026-05-09

Tags

Data Science