Concept

Role of Causal Attention in Autoregressive Language Models

Causal attention is fundamental to autoregressive language models, which predict the next token in a sequence based solely on the preceding tokens (the left context). The causal attention mechanism enforces this constraint by masking out attention to future positions, ensuring that the model's output at any position i depends only on tokens at positions 0 through i.
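As a concrete illustration (not part of the original text), the sketch below shows single-head scaled dot-product attention with a causal mask in PyTorch. The function name causal_self_attention and the projection matrices w_q, w_k, w_v are hypothetical; the masking itself follows the standard approach of setting scores for future positions to -inf so that the softmax assigns them zero weight.

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product attention with a causal mask.

    x: (seq_len, d_model) token representations
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical names)
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_head = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_head**0.5  # (seq_len, seq_len)

    # Causal mask: position i may attend only to positions 0..i.
    # Strictly upper-triangular entries (future positions) get -inf,
    # so the softmax gives them zero attention weight.
    seq_len = x.size(0)
    mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))

    weights = F.softmax(scores, dim=-1)  # each row sums to 1 over allowed positions
    return weights @ v

# Example usage with random inputs:
torch.manual_seed(0)
seq_len, d_model, d_head = 5, 8, 4
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_head) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([5, 4])
```

Note that the attention weight matrix produced here is lower-triangular: row i places nonzero weight only on columns 0 through i, which is exactly the left-context constraint described above.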
