Concept

Balancing Efficiency and Accuracy with Beam Width (K)

Choosing the beam width, K, in beam search means trading search efficiency against output accuracy. A larger K lets the algorithm keep and explore more candidate sequences at each step, which can improve accuracy but raises the computational cost. The gains also diminish: an excessively large K may add cost without providing significant benefits. For LLM inference tasks, practical experience shows that small values, such as K=2 or K=4, often achieve satisfactory performance efficiently.
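The trade-off can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `TOY_MODEL` is a hypothetical hand-written table of next-token probabilities (not a real LLM), and `beam_search` is a simplified implementation that counts how many prefixes it expands as a rough proxy for compute cost.

```python
import math

# Hypothetical toy "model": the next-token distribution depends only on the
# previous token. The probabilities are made up for illustration.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"x": 0.9, "y": 0.1},
    "x": {"</s>": 1.0},
    "y": {"</s>": 1.0},
}


def beam_search(k, max_steps=3):
    """Return (best_sequence, log_prob, expansions) for beam width k."""
    beams = [(["<s>"], 0.0)]  # each beam is (token list, cumulative log-prob)
    expansions = 0            # number of prefixes passed to the "model"
    for _ in range(max_steps):
        candidates = []
        for seq, score in beams:
            if seq[-1] == "</s>":
                # Finished hypotheses are carried over unchanged.
                candidates.append((seq, score))
                continue
            expansions += 1
            for tok, p in TOY_MODEL[seq[-1]].items():
                candidates.append((seq + [tok], score + math.log(p)))
        # Keep only the k highest-scoring candidates for the next step.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    best_seq, best_score = beams[0]
    return best_seq, best_score, expansions
```

Under this toy table, K=1 (greedy search) commits to "a" first and ends with a sequence of probability 0.30 after 3 expansions. K=2 keeps "b" alive and finds the better sequence of probability 0.36 at 5 expansions. K=4 returns the same sequence as K=2 but needs 7 expansions, which mirrors the point that a larger beam adds cost without necessarily improving the result.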

Updated 2026-05-03

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
