Beam Size in Beam Search
The most straightforward version of beam search relies on a single hyperparameter called the beam size, denoted as . The beam size specifies the exact number of top candidate sequences that the algorithm maintains at each generation time step, providing a flexible trade-off between the accuracy of the output and the overall computational cost.
0
1
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
D2L
Dive into Deep Learning @ D2L
Related
Top-K Token Selection in Beam Search
A text generation model is creating a sequence of words. It uses a search process that keeps track of the 2 most probable sequences at each step. The score for a sequence is the sum of the log-probabilities of its words. Given the state of the search below, which two sequences will be kept for the next step?
Step 1: The initial two sequences being tracked are:
- Sequence 1: "The" (Score: -0.5)
- Sequence 2: "A" (Score: -0.9)
Step 2: The model calculates the log-probabilities for the next possible words for each sequence:
- Expanding "The":
- "cat": -0.8
- "dog": -1.1
- Expanding "A":
- "mouse": -0.2
- "lion": -1.5
Analyzing Search Algorithm Behavior
Diagnosing a Flaw in Sequence Generation
You are tuning decoding for an internal "meeting-n...
You’re deploying an LLM to draft customer-facing i...
You’re building an internal “RFP response drafter”...
You’re implementing an LLM feature that generates ...
Post-incident analysis: fixing repetition and truncation by tuning decoding
Debugging Decoding: Balancing Determinism, Diversity, and Length in a Regulated Product
Selecting and Justifying a Decoding Policy for Two Production Use Cases
Choosing a Decoding Configuration Under Latency, Diversity, and Length Constraints
Release-readiness decision: decoding configuration for a customer-facing summarization feature
Decoding policy decision for a multilingual support assistant under safety, latency, and verbosity constraints
Computational Cost of Beam Search in Sequence-to-Sequence Models
Beam Size in Beam Search
Learn After
Balancing Efficiency and Accuracy with Beam Width (K)
An engineer is using a text generation model that employs a search algorithm where a parameter,
K, determines the number of top candidate sequences kept at each step. The engineer observes that withK=1, the generated text is often repetitive and predictable. To improve the diversity and potential quality of the output, which of the following adjustments toKis the most logical next step?Analyzing Generation Algorithm Performance
Analyzing Parameter Impact on Text Generation
Process of Beam Search