Concept

Prefilling as an Encoding Process

The prefilling phase can be viewed as an encoding process, even though its underlying mechanism is token prediction. Its primary objective is not to generate output tokens but to build a contextual representation of the input sequence in the form of the Key-Value (KV) cache. The decoding phase then reads this cache to condition each subsequent generated token on the input.
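To make the encode-then-decode split concrete, here is a minimal sketch assuming a single-head causal self-attention layer written in NumPy. The class `ToyAttentionLayer` and its methods `prefill` and `decode_step` are hypothetical names chosen for illustration; they are not taken from the source or from any library. Prefilling processes the whole prompt in one pass and returns the keys and values as the cache; each decoding step then attends over that cache and appends one new entry.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


class ToyAttentionLayer:
    """Hypothetical single-head attention layer (illustration only)."""

    def __init__(self, d_model, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d_model)
        self.d_model = d_model
        self.w_q = rng.normal(size=(d_model, d_model)) * scale
        self.w_k = rng.normal(size=(d_model, d_model)) * scale
        self.w_v = rng.normal(size=(d_model, d_model)) * scale

    def prefill(self, x):
        """Encode a prompt of shape (n, d_model) in one parallel pass.

        Returns the per-position outputs and the KV cache (keys, values).
        """
        q, k, v = x @ self.w_q, x @ self.w_k, x @ self.w_v
        n = x.shape[0]
        scores = q @ k.T / np.sqrt(self.d_model)
        # Causal mask: position i may only attend to positions <= i.
        future = np.triu(np.ones((n, n), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
        out = softmax(scores) @ v
        return out, (k, v)

    def decode_step(self, x_new, kv_cache):
        """Process one new token of shape (d_model,) using the cache.

        Only the new token's query, key, and value are computed; the
        keys and values of earlier tokens are read from the cache.
        """
        k_cache, v_cache = kv_cache
        q = x_new @ self.w_q
        k = np.vstack([k_cache, x_new @ self.w_k])
        v = np.vstack([v_cache, x_new @ self.w_v])
        scores = k @ q / np.sqrt(self.d_model)
        out = softmax(scores) @ v
        return out, (k, v)


if __name__ == "__main__":
    layer = ToyAttentionLayer(d_model=8)
    prompt = np.random.default_rng(1).normal(size=(4, 8))

    # Prefilling: the useful product is the cache, not the outputs.
    _, cache = layer.prefill(prompt)
    print("cached keys:", cache[0].shape)

    # Decoding: one token at a time, conditioned on the cache.
    next_token = np.random.default_rng(2).normal(size=8)
    _, cache = layer.decode_step(next_token, cache)
    print("cached keys after one step:", cache[0].shape)
```

In this sketch the output of a decoding step matches what a full causal pass over the extended sequence would produce at the last position, which is why the cache can stand in for re-encoding the prompt. A real model would keep one such cache per layer and per attention head.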

Updated 2026-05-03

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related