Learn Before
Attention-level improvements of Transformers
Improvements to the attention module can be categorized into the following directions:
- Sparse Attention
- Linearized Attention (see the sketch after this list)
- Prototype and Memory Compression
- Low-rank Self-Attention
- Attention with Prior
- Improved Multi-Head Attention
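
As a quick illustration of one of these directions, the sketch below shows the idea behind Linearized Attention: a feature map replaces the softmax so the output can be computed as φ(Q)(φ(K)ᵀV), bringing the cost from quadratic to linear in sequence length. The elu-plus-one feature map and the toy tensor shapes are assumptions chosen for illustration, not part of this card.

```python
import torch
import torch.nn.functional as F

def linearized_attention(q, k, v):
    """Kernel-based attention in O(n): softmax(QK^T)V is approximated as
    phi(Q) (phi(K)^T V), normalized per query. The elu+1 feature map is one
    common (assumed) choice; q, k, v have shape (batch, seq_len, dim)."""
    phi_q = F.elu(q) + 1                              # positive feature map
    phi_k = F.elu(k) + 1
    kv = torch.einsum("bnd,bne->bde", phi_k, v)       # fixed-size (dim x dim) summary
    z = 1.0 / (torch.einsum("bnd,bd->bn", phi_q, phi_k.sum(dim=1)) + 1e-6)
    return torch.einsum("bnd,bde,bn->bne", phi_q, kv, z)

# Toy check: output has the same shape as standard attention would produce.
q = k = v = torch.randn(2, 128, 64)
print(linearized_attention(q, k, v).shape)            # torch.Size([2, 128, 64])
```

Because the key-value summary has a fixed size independent of sequence length, memory and compute grow linearly with the number of tokens instead of quadratically.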
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Learn After
Sparse Attention
Query Prototyping and Memory Compression
Low Rank Self-Attention
Attention with Prior
Improved Multi-Head Attention Mechanism
Linear Attention
A research team is working to reduce the computational cost of the attention mechanism for processing extremely long documents. Their proposed solution involves modifying the attention calculation so that each query token only computes attention scores with a small, fixed subset of key tokens (e.g., neighboring tokens and a few globally important tokens) instead of all tokens in the sequence. Which category of attention improvement best describes this approach?
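
A minimal sketch of the attention pattern the question describes: each query attends only to a small local window of neighbors plus a few designated global positions, rather than to every token. The window size, the choice of global token, and the toy shapes below are illustrative assumptions; a real implementation would also avoid materializing the full score matrix.

```python
import torch

def local_plus_global_mask(seq_len, window=2, global_idx=(0,)):
    """Boolean mask: True where a query may attend to a key. Each query sees
    its +/- `window` neighbors plus the global positions; the global positions
    attend everywhere. All values here are illustrative assumptions."""
    idx = torch.arange(seq_len)
    mask = (idx[:, None] - idx[None, :]).abs() <= window   # banded local window
    for g in global_idx:
        mask[:, g] = True    # every query can attend to the global token
        mask[g, :] = True    # the global token attends to every key
    return mask

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with disallowed query-key pairs set to -inf
    before the softmax, so each query only uses its permitted keys."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(16, 32)                        # (seq_len, dim) toy example
mask = local_plus_global_mask(16, window=2, global_idx=(0,))
print(masked_attention(q, k, v, mask).shape)           # torch.Size([16, 32])
```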
Match each attention improvement strategy with its core operational principle.
Optimizing Transformer Attention for Long Sequences
Evaluating Attention Optimization Strategies for Specific Applications