Definition

Index Set of Non-Zero Attention Weights ($G$)

In sparse attention, the set $G$ denotes the subset of indices for which the attention weights are non-zero and are therefore computed. For a token at position $i$ in a causal model, this set is a subset of the positions up to and including $i$, formally $G \subseteq \{0, \dots, i\}$. The set defines the sparsity pattern: it identifies which key-value pairs the current query attends to, and every position outside $G$ receives an attention weight of zero.
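As a minimal sketch of the definition above, the following Python builds one possible index set and computes attention for a single query restricted to it. The sliding-window pattern, the function names `local_index_set` and `sparse_attention`, and the `window` parameter are illustrative assumptions and do not come from the source; any rule that yields a subset of $\{0, \dots, i\}$ would be an equally valid choice of $G$.

```python
import math


def local_index_set(i, window):
    """Hypothetical sparsity pattern: the last `window` positions up to and including i.

    The result is always a subset of {0, ..., i}, as the causal constraint requires.
    """
    return list(range(max(0, i - window + 1), i + 1))


def sparse_attention(q, keys, values, G):
    """Scaled dot-product attention for one query, computed only over the indices in G.

    Positions outside G are never scored, which is equivalent to giving them weight zero.
    """
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d) for j in G]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    out = [
        sum(w * values[j][k] for w, j in zip(weights, G))
        for k in range(len(values[0]))
    ]
    return weights, out


# Toy data: four positions with 2-dimensional keys and values (made up for illustration).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

i = 3
G = local_index_set(i, window=2)  # positions 2 and 3 only
weights, out = sparse_attention(keys[i], keys, values, G)
print(G, weights, out)
```

Only `len(G)` scores are computed per query instead of `i + 1`, which is where the cost saving of sparse attention comes from in this sketch.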


Updated 2026-04-22


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences