Heuristic Search Algorithms for LLM Inference

Because exhaustively searching the exponentially large space of output sequences is computationally infeasible, LLM inference relies on heuristic search algorithms. These practical methods use pruning to navigate the vast output space efficiently: at each decoding step they discard low-probability or low-quality continuations, trading guaranteed optimality for a manageable computational cost. Common heuristic techniques include greedy search and sampling-based search.
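The contrast between the two techniques can be sketched with a toy decoder. The vocabulary, the `toy_logits` scorer standing in for an LLM, and the specific top-k pruning rule below are all illustrative assumptions, not part of the source; the point is only that greedy search prunes to the single best token per step, while sampling-based search prunes to a small candidate set and draws from it.

```python
import math
import random

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["<eos>", "the", "cat", "sat", "mat"]

def toy_logits(prefix):
    """Fake next-token logits standing in for a language model (illustration only)."""
    table = {
        (): [0.0, 3.0, 0.5, 0.1, 0.1],
        ("the",): [0.0, 0.1, 3.0, 0.5, 0.1],
        ("the", "cat"): [0.0, 0.1, 0.1, 3.0, 0.1],
        ("the", "cat", "sat"): [3.0, 0.1, 0.1, 0.1, 0.5],
    }
    return table.get(tuple(prefix), [3.0, 0.0, 0.0, 0.0, 0.0])

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def greedy_decode(max_steps=10):
    """Greedy search: prune everything except the single most probable token."""
    out = []
    for _ in range(max_steps):
        probs = softmax(toy_logits(out))
        tok = VOCAB[probs.index(max(probs))]
        if tok == "<eos>":
            break
        out.append(tok)
    return out

def sample_decode(k=2, max_steps=10, rng=None):
    """Sampling-based search: prune to the k most probable tokens, then sample."""
    rng = rng or random.Random(0)
    out = []
    for _ in range(max_steps):
        probs = softmax(toy_logits(out))
        # Keep only the top-k vocabulary indices (the pruning step).
        top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        tok = VOCAB[rng.choices(top, weights=[probs[i] for i in top])[0]]
        if tok == "<eos>":
            break
        out.append(tok)
    return out
```

Greedy decoding is deterministic (here it always yields `the cat sat`), whereas sampling explores alternative continuations; both visit only a handful of candidates per step instead of the full exponential tree.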

Updated 2026-05-03

Tags

Ch.5 Inference - Foundations of Large Language Models
Computing Sciences