Concept

Global Nature of Standard Transformer LLMs

Large language models built on the standard Transformer architecture are global models: to predict the next token during inference, they must attend over the complete left-context, that is, the entire history of previously generated tokens. This history is stored in a Key-Value (KV) cache, which retains the key and value representations of every past token, so the caching cost grows linearly as generation proceeds.
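The growth described above can be sketched as follows. This is a minimal toy illustration, not any real model's implementation: it uses a single attention head, identity key/value projections (a real model would apply learned matrices), and NumPy in place of a deep-learning framework.

```python
import numpy as np

d = 8  # head dimension (illustrative value)

def decode_step(kv_cache, x):
    """Append the new token's key/value, then attend over the full left-context."""
    k, v = x, x                          # toy projection; really k = W_k @ x, v = W_v @ x
    kv_cache["k"].append(k)
    kv_cache["v"].append(v)
    K = np.stack(kv_cache["k"])          # (t, d): one cached key per past token
    V = np.stack(kv_cache["v"])          # (t, d): one cached value per past token
    scores = K @ x / np.sqrt(d)          # query attends to ALL cached keys
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()             # softmax over the whole left-context
    return weights @ V                   # context vector for the current step

cache = {"k": [], "v": []}
rng = np.random.default_rng(0)
for t in range(5):                       # generate 5 tokens
    _ = decode_step(cache, rng.standard_normal(d))

# One (key, value) pair is stored per generated token, so memory grows with t.
print(len(cache["k"]))  # 5
```

Note that each step recomputes attention over the entire cache; this is exactly why the memory and per-step compute of a standard Transformer increase as the sequence gets longer.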


Updated 2026-04-22


Ch.2 Generative Models - Foundations of Large Language Models
