Concept

Architectural Adaptation of LLMs for Long Sequences

To overcome the challenges of processing long sequences, the architecture of Large Language Models is evolving. Two pressures drive this shift: the quadratic time and memory cost of self-attention with respect to sequence length, and the KV cache, which grows linearly with context length and can dominate inference memory. As a result, model design is moving away from the standard Transformer toward more efficient variants and alternative architectures.
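The memory pressure from the KV cache is easy to quantify. A minimal sketch (the formula is standard; the model configuration below is an illustrative assumption, roughly a 7B-class Transformer, not a figure from this text): the cache stores one key and one value vector per token, per head, per layer.

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):
    """Size of the KV cache for a standard multi-head-attention Transformer.

    Factor of 2 = one key tensor + one value tensor per layer;
    bytes_per_elem=2 assumes fp16/bf16 storage.
    """
    return 2 * num_layers * batch * seq_len * num_heads * head_dim * bytes_per_elem

# Illustrative (assumed) 7B-class config: 32 layers, 32 heads, head_dim 128.
# At a 128k-token context in fp16, the cache alone is ~67 GB:
cache_gb = kv_cache_bytes(32, 32, 128, 128_000) / 1e9
print(f"{cache_gb:.1f} GB")  # ~67.1 GB
```

Because this grows linearly with context length (on top of attention's quadratic compute), long-context serving quickly becomes memory-bound, which is exactly the pressure motivating the architectural variants discussed here.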


Updated 2026-05-05
