Heuristic Search Algorithms for LLM Inference
Because an exhaustive search over the exponentially large space of possible output sequences is computationally infeasible, LLM inference relies on heuristic search algorithms. These practical methods use pruning to navigate the vast output space efficiently: at each decoding step, low-probability or low-quality continuations are discarded rather than explored, trading guaranteed optimality for a manageable computational cost. Common heuristic techniques include greedy search and sampling-based search.
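The two techniques named above can be sketched on a toy next-token distribution. This is a minimal illustration, not a real model's output: the vocabulary, probabilities, and function names below are invented for the example, and the "pruning" here is simple top-k truncation of the candidate set.

```python
import random

# Hypothetical next-token distribution for one decoding step
# (illustrative values only, not from an actual LLM).
next_token_probs = {
    "Paris": 0.40,
    "a": 0.25,
    "the": 0.20,
    "located": 0.10,
    "banana": 0.05,
}

def greedy_step(probs):
    """Greedy search: always pick the single most probable token."""
    return max(probs, key=probs.get)

def top_k_sample_step(probs, k=3, rng=random):
    """Sampling-based search with pruning: keep only the k most
    probable tokens, then sample from that truncated set."""
    top_k = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    tokens, weights = zip(*top_k)
    return rng.choices(tokens, weights=weights, k=1)[0]

print(greedy_step(next_token_probs))        # always "Paris"
print(top_k_sample_step(next_token_probs))  # one of "Paris", "a", "the"
```

Greedy search is deterministic and cheap but can miss globally better sequences; top-k sampling explores more of the space while still pruning away the long tail of improbable tokens.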
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Heuristic Search Algorithms for LLM Inference
Stopping Criteria in LLM Inference
Computational Infeasibility of Exhaustive Search in LLM Decoding
A language model is given the prompt 'The capital of France is'. Internally, the model's calculations show that the single most probable next word is 'Paris'. However, the model ultimately generates the sequence 'The capital of France is a beautiful city'. Which statement best analyzes the reason for this discrepancy?
The Challenge of Generating Optimal Text
Analyzing Text Generation Behavior
Hypothesis in LLM Inference
Mathematical Formulation of the Search Problem in LLM Inference
Exploration vs. Exploitation in LLM Search
Search Tree Structure in Token Generation
Efficient Generation of Candidate Solutions via Search Algorithms
Search for Optimal or Sub-optimal Sequences in LLM Inference
Root of the Search Space as a Representation of Input (x)
A text generation model has a vocabulary of 10,000 possible words it can choose from for each position in a sequence. If this model were to find the optimal output by evaluating every single possible sequence, how would the total number of sequences to check change if the desired output length is increased from 3 words to 5 words?
Evaluating an Inference Strategy
The Impracticality of Exhaustive Search
Historical Context and Computational Challenges of Maximum Probability Prediction
Mathematical Representation of an Output Sequence
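The vocabulary-size question in the list above reduces to simple arithmetic: an exhaustive search must evaluate |V|^L candidate sequences for vocabulary size |V| and output length L. A quick check, using the numbers from the question itself:

```python
# Exhaustive search space size: |V|**L sequences for
# vocabulary size |V| and output length L.
V = 10_000

seqs_len_3 = V ** 3            # 10^12 candidate sequences
seqs_len_5 = V ** 5            # 10^20 candidate sequences
growth = seqs_len_5 // seqs_len_3  # grows by a factor of 10^8

print(f"{seqs_len_3:.0e} -> {seqs_len_5:.0e} (x{growth:.0e})")
```

Extending the output by just two words multiplies the search space by V^2 = 10^8, which is why exhaustive search is abandoned in favor of the heuristic methods this note describes.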
Learn After
Sampling-Based Search for LLM Inference
Sequence Evaluation using Log-Probability
Deterministic Decoding Algorithms
Modifying the Search Objective to Improve Decoding
Maximum a Posteriori (MAP) Decoding
Speculative Decoding
Structured Search in Decoding
Trade-off between Search Quality and Computational Efficiency in Heuristic Search
An engineer is building a real-time chatbot that must respond to user queries very quickly. To achieve this speed, the engineer implements a text generation strategy that, at each step of forming a response, considers only a small subset of the most likely next words instead of all possible words in the vocabulary. What is the fundamental trade-off inherent in this design choice?
Evaluating a Decoding Algorithm Claim
Analysis of Competing Text Generation Systems