Prefilling-Decoding Frameworks
The prefilling-decoding framework is the most widely adopted structure for interpreting and implementing the inference process in Large Language Models. It splits inference into two phases: a prefilling phase, in which the model processes the entire input prompt in one pass and populates the key-value (KV) cache that establishes the context for generation, and a decoding phase, in which output tokens are generated autoregressively, one at a time, with each step reusing the cached states and applying a decoding (search) algorithm, such as greedy or beam search, to choose the next token. The two phases have distinct computational profiles: prefilling is a single, parallel, compute-bound pass over the prompt, while decoding is a sequence of memory-bound steps whose cost grows with the number of generated tokens.
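To make the two phases concrete, below is a minimal Python sketch of the framework. The model here is a hypothetical stand-in (toy_forward, a pure-Python next-token scorer over a 50-token vocabulary), not a real LLM; what mirrors real implementations is the prefill/decode split, the cache that prefilling builds, and the one-token-at-a-time generation loop.

```python
# Minimal sketch of the prefilling-decoding framework.
# toy_forward is a hypothetical stand-in for a transformer forward pass;
# the two-phase structure around it is the point of the example.

def toy_forward(token, cache):
    """Process one token: update the cache and return next-token logits.

    A real transformer would attend over cached key/value states here;
    this toy just folds the token into a running integer state.
    """
    state = (cache[-1] if cache else 0) * 31 + token
    new_cache = cache + [state]                      # stand-in for the KV cache
    logits = [(state + v) % 50 for v in range(50)]   # toy 50-token vocabulary
    return new_cache, logits

def prefill(prompt_tokens):
    """Phase 1 (prefilling): run the whole prompt through the model once,
    populating the cache that establishes the context for generation."""
    cache, logits = [], None
    for tok in prompt_tokens:        # a real LLM processes these in parallel
        cache, logits = toy_forward(tok, cache)
    return cache, logits

def decode(cache, logits, max_new_tokens, eos=0):
    """Phase 2 (decoding): generate output tokens autoregressively,
    feeding each new token back in and reusing the cache."""
    output = []
    for _ in range(max_new_tokens):
        next_tok = max(range(len(logits)), key=logits.__getitem__)  # greedy
        if next_tok == eos:
            break
        output.append(next_tok)
        cache, logits = toy_forward(next_tok, cache)
    return output

prompt = [12, 7, 33, 4]                   # stand-in for a tokenized prompt
cache, logits = prefill(prompt)           # one pass over the whole prompt
print(decode(cache, logits, max_new_tokens=8))  # token-by-token generation
```

Greedy selection stands in for any decoding algorithm here; swapping in sampling or beam search would change only how the next token is chosen, not the two-phase structure.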