Relation

Attention-level improvements of Transformers

Improvements to the attention module can be categorized into these directions:

  1. Sparse Attention
  2. Linearized Attention (see the sketch after this list)
  3. Prototype and Memory Compression
  4. Low-rank Self-Attention
  5. Attention with Prior
  6. Improved Multi-Head Attention

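As a concrete illustration of direction 2, here is a minimal sketch of linearized attention, assuming the common positive feature map phi(x) = elu(x) + 1 (as in Katharopoulos et al., 2020). The function names, shapes, and toy data below are illustrative, not part of this note:

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive feature map so that phi(q) . phi(k) > 0
    return np.where(x > 0, x + 1.0, np.exp(x))

def linearized_attention(Q, K, V):
    """Kernelized attention: softmax(q.k) is replaced by phi(q).phi(k),
    so the products can be re-associated and computed in O(N) instead of O(N^2).

    Q, K: (N, d) queries and keys; V: (N, d_v) values.
    """
    Qf = feature_map(Q)                        # (N, d)
    Kf = feature_map(K)                        # (N, d)
    KV = Kf.T @ V                              # (d, d_v), shared across all queries
    Z = Qf @ Kf.sum(axis=0, keepdims=True).T   # (N, 1) normalizer
    return (Qf @ KV) / (Z + 1e-9)              # (N, d_v)

# Toy usage
rng = np.random.default_rng(0)
N, d = 8, 4
Q, K, V = rng.normal(size=(N, d)), rng.normal(size=(N, d)), rng.normal(size=(N, d))
print(linearized_attention(Q, K, V).shape)  # (8, 4)
```

Because phi(K)^T V is computed once and reused for every query, the cost drops from O(N^2 d) for standard softmax attention to O(N d * d_v), which is the point of the linearization direction.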

Updated 2025-10-10

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences