Search Space Pruning in LLM Decoding
Because the search space grows exponentially with sequence length, exhaustive search is computationally infeasible; practical decoding algorithms therefore rely on pruning strategies. These methods identify and discard low-quality or unpromising partial sequences at each step of the generation process, focusing computational effort on a smaller, more manageable set of high-potential candidates.
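As a concrete illustration, here is a minimal, self-contained sketch of step-wise pruning in the style of beam search. The toy next_token_logprobs distribution, the function names, and the beam_width parameter are illustrative assumptions, not details from the source:

```python
import math
from typing import List, Tuple

def next_token_logprobs(prefix: Tuple[str, ...]) -> List[Tuple[str, float]]:
    """Toy stand-in for a language model: returns (token, log-prob) pairs
    for the next position. A real LLM would condition on `prefix`."""
    vocab = {"the": 0.5, "cat": 0.2, "sat": 0.15, "on": 0.1, "mat": 0.05}
    return [(tok, math.log(p)) for tok, p in vocab.items()]

def expand_and_prune(beam_width: int, steps: int) -> List[Tuple[Tuple[str, ...], float]]:
    """At each step, expand every surviving hypothesis by all candidate
    tokens, then prune: keep only the top `beam_width` hypotheses by
    cumulative log-probability."""
    hypotheses: List[Tuple[Tuple[str, ...], float]] = [((), 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in hypotheses:
            for tok, logp in next_token_logprobs(seq):
                candidates.append((seq + (tok,), score + logp))
        # The pruning step: discard everything outside the top beam_width,
        # so the frontier stays constant instead of growing exponentially.
        candidates.sort(key=lambda c: c[1], reverse=True)
        hypotheses = candidates[:beam_width]
    return hypotheses

if __name__ == "__main__":
    for seq, score in expand_and_prune(beam_width=5, steps=3):
        print(" ".join(seq), f"(cumulative log-prob {score:.2f})")
```

Note that the sort-and-truncate line is where the trade-off lives: every hypothesis it drops is gone for good, even if one of them would have led to the best overall sequence.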

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Search Space Pruning in LLM Decoding
A language model with a vocabulary of 30,000 unique tokens is generating a response. If the model were to perform a complete, exhaustive search to find the absolute best possible 5-token sequence, which calculation represents the total number of unique sequences it would need to evaluate? (A worked version of this count appears after this list.)
Evaluating a Decoding Strategy Proposal
Decoding Strategy Post-Mortem
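For reference, the exhaustive count in the question above follows from the fact that each of the L positions in the sequence can hold any of the |V| vocabulary tokens (a sketch of the arithmetic, assuming a fixed length of exactly 5 tokens and no early stopping):

```latex
|V|^{L} = 30{,}000^{5} = 2.43 \times 10^{22} \text{ candidate sequences}
```

At roughly 10^22 candidates, scoring each one is far beyond any practical compute budget, which is precisely the motivation for pruning.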
Learn After
Greedy Search (Greedy Decoding)
Formula for Pruned Step-wise Expansion of the Hypothesis Set
A language model is generating a sentence and must decide on the next word. It has identified 100 possible words, each with an associated probability. To manage computational resources, the model employs a strategy that discards all but the top 5 most probable words before considering the subsequent step. Which of the following statements best analyzes the primary trade-off inherent in this strategy? (A formula sketch for this kind of step-wise pruning appears after this list.)
Analyzing Text Generation System Performance
Rationale for Decoding Heuristics
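The top-5 question above, and the "Formula for Pruned Step-wise Expansion of the Hypothesis Set" it relates to, describe the same mechanism. One common way to write it (a sketch; the symbols H_t for the hypothesis set at step t, k for the number of kept hypotheses, and ⊕ for token concatenation are our notation, not necessarily the source's):

```latex
\tilde{H}_t = \{\, h \oplus v \mid h \in H_{t-1},\ v \in V \,\},
\qquad
H_t = \operatorname{top-}k\bigl(\tilde{H}_t;\ \log P\bigr),
\qquad
|H_t| = k
```

With k = 5 and 100 candidate words, each step scores only 5 × 100 = 500 continuations rather than an exponentially growing frontier; the trade-off is that the prefix of the globally best sequence is lost whenever it momentarily falls outside the top k.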