Concept

Causal Attention

Causal attention is a self-attention mechanism in which the query at position i may attend only to keys and values at positions up to and including i (K_<=i, V_<=i). This restriction, typically implemented with a mask that blocks attention to future positions, ensures that the model's prediction for a token depends only on the preceding tokens, never on future ones. The computation is written Att_qkv(q_i, K_<=i, V_<=i).
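A minimal sketch of this idea in NumPy, assuming single-head scaled dot-product attention with Q, K, V already projected; the function name and shapes are illustrative, not from the source:

```python
import numpy as np

def causal_attention(Q, K, V):
    """Scaled dot-product attention with a causal mask.

    Q, K, V: arrays of shape (seq_len, d).
    Position i attends only to positions j <= i.
    """
    seq_len, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len) similarity scores
    # Mask out future positions: entries with j > i get -inf before softmax.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[future] = -np.inf
    # Row-wise softmax; masked entries contribute exactly 0 weight.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# With a causal mask, perturbing a later token cannot change the
# output at any earlier position.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out1 = causal_attention(Q, K, V)
K2, V2 = K.copy(), V.copy()
K2[3] += 1.0
V2[3] += 1.0  # perturb only the last position
out2 = causal_attention(Q, K2, V2)
assert np.allclose(out1[:3], out2[:3])  # earlier outputs unchanged
```

Note that position 0 can attend only to itself, so its output equals V[0] exactly; the final assertion demonstrates the defining property that future tokens do not influence past positions.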

Updated 2025-10-06

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences