Search (Decoding) Algorithms for LLM Inference
Search algorithms, also known as decoding algorithms, are foundational methods for LLM inference: they navigate the vast space of possible output sequences to select a highly probable one. Most of these algorithms operate as a level-by-level search, extending the sequence one token at a time. While many of these techniques are now central to LLMs, some have their roots in earlier sequence-to-sequence models, whereas others are more recent developments associated specifically with modern large-scale models.
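The simplest instance of this level-by-level search is greedy decoding: at each step, pick the single most probable next token and append it. The sketch below illustrates the idea with a hypothetical hand-written next-token distribution standing in for a real LLM's forward pass; the function and table names are illustrative, not from any particular library.

```python
# Minimal sketch of greedy (level-by-level) decoding.
# `toy_next_token_probs` is a hypothetical stand-in for an LLM's
# next-token distribution; a real model computes these probabilities
# with a forward pass over the full context.
def toy_next_token_probs(context):
    table = {
        ("The",): {"capital": 0.6, "answer": 0.4},
        ("The", "capital"): {"of": 0.9, "city": 0.1},
        ("The", "capital", "of"): {"France": 0.7, "Spain": 0.3},
        ("The", "capital", "of", "France"): {"<eos>": 1.0},
    }
    return table[tuple(context)]

def greedy_decode(prompt, max_steps=10):
    seq = list(prompt)
    for _ in range(max_steps):
        probs = toy_next_token_probs(seq)
        token = max(probs, key=probs.get)  # most probable next token
        if token == "<eos>":               # stop at end-of-sequence
            break
        seq.append(token)
    return seq

print(greedy_decode(["The"]))  # → ['The', 'capital', 'of', 'France']
```

Greedy decoding is fast but can miss globally better sequences, since a locally less probable token may lead to a higher-probability continuation; beam search and sampling-based methods trade extra computation for broader exploration of the search space.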
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Prefilling-Decoding Frameworks
Evaluation Metrics for LLM Inference Performance
Methods for Improving LLM Inference Efficiency
Purpose of Defining Notation for LLM Inference
Interdisciplinary Nature of Efficient LLM Inference
Inference-Time Scaling
A technology company is deploying a large language model for a customer service chatbot. They face two distinct challenges: 1) The time and computational power required to generate a response for each user is too high, leading to slow reply times and expensive server costs. 2) The generated responses, while fluent, are often too generic and repetitive. Which two distinct areas of inference study are most relevant for solving challenge #1 and challenge #2, respectively?
Match each core area of LLM inference study with its primary goal.
Optimizing an LLM for a Code Generation Application
Search (Decoding) Algorithms for LLM Inference
Establishing the Initial Context for Inference
A user provides a large document (e.g., 2000 tokens) as input to a language model to generate a brief, 20-token answer. Considering the widely adopted two-phase framework for inference, which statement best distinguishes the computational characteristics of processing the initial document versus generating the answer?
Analysis of the Two-Phase Inference Framework
A user submits a prompt to a large language model. Arrange the following events in the correct chronological order as they would occur within the standard two-phase inference framework.
Learn After
Heuristic Search Algorithms for LLM Inference
Stopping Criteria in LLM Inference
Computational Infeasibility of Exhaustive Search in LLM Decoding
A language model is given the prompt 'The capital of France is'. Internally, the model's calculations show that the single most probable next word is 'Paris'. However, the model ultimately generates the sequence 'The capital of France is a beautiful city'. Which statement best analyzes the reason for this discrepancy?
The Challenge of Generating Optimal Text
Analyzing Text Generation Behavior