Learn Before
Activity (Process)

Process of Generating Prefix Caches

Prefix caches are generated by running input sequences, often drawn from a representative dataset, through a pass analogous to the standard prefilling phase. For each sequence, the system computes and stores the Key-Value (KV) cache state for every one of its prefixes. In a causal (decoder-only) model, the KV entries at a position depend only on the tokens up to that position, so a single pass over a sequence yields the state for all of its prefixes. The result is a collection of mappings in which each unique prefix is associated with its corresponding KV cache state, ready for later reuse.
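The process can be illustrated with a minimal Python sketch. It is not taken from the source: `toy_kv` is a hypothetical stand-in for a model's per-position key/value computation, `build_prefix_cache` is an illustrative name, and a plain dictionary keyed by token tuples is an assumed storage layout. A real system would store per-layer tensors and typically share storage between overlapping prefixes.

```python
from typing import Dict, List, Tuple

Token = int
KVEntry = Tuple[float, float]  # toy (key, value) pair for one position
Prefix = Tuple[Token, ...]


def toy_kv(prefix: Prefix) -> KVEntry:
    """Hypothetical stand-in for the model's KV computation at the last position.

    The result depends on every token in the prefix, mimicking causal attention.
    """
    key = float(sum(prefix))
    value = float(prefix[-1] * len(prefix))
    return (key, value)


def build_prefix_cache(sequences: List[List[Token]]) -> Dict[Prefix, List[KVEntry]]:
    """Map every prefix of every sequence to its list of per-position KV entries."""
    cache: Dict[Prefix, List[KVEntry]] = {}
    for seq in sequences:
        kv_state: List[KVEntry] = []
        for i in range(len(seq)):
            prefix = tuple(seq[: i + 1])
            if prefix in cache:
                # Prefix already seen in an earlier sequence: reuse its state.
                kv_state = cache[prefix]
                continue
            # Extend the previous prefix's state by one position (new list, no mutation).
            kv_state = kv_state + [toy_kv(prefix)]
            cache[prefix] = kv_state
    return cache


if __name__ == "__main__":
    cache = build_prefix_cache([[1, 2, 3], [1, 2, 4]])
    for prefix, state in cache.items():
        print(prefix, state)
```

The two example sequences share the prefix `(1, 2)`, so its state is computed once and reused when the second sequence is processed; only the entry for `(1, 2, 4)` is new.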

Updated 2026-05-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related