Learn Before
Beam Width (K)
In the beam search algorithm, the beam width, denoted by the parameter K, specifies the number of top candidate sequences that are maintained at each step of the generation process.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Beam Width (K)
Top-K Token Selection in Beam Search
A text generation model is creating a sequence of words. It uses a search process that keeps track of the 2 most probable sequences at each step. The score for a sequence is the sum of the log-probabilities of its words. Given the state of the search below, which two sequences will be kept for the next step?
Step 1: The initial two sequences being tracked are:
- Sequence 1: "The" (Score: -0.5)
- Sequence 2: "A" (Score: -0.9)
Step 2: The model calculates the log-probabilities for the next possible words for each sequence:
- Expanding "The":
  - "cat": -0.8
  - "dog": -1.1
- Expanding "A":
  - "mouse": -0.2
  - "lion": -1.5
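The expand-and-prune step described in this question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_and_prune` and the dictionary layout are hypothetical, and the numbers are copied from the search state above. Each candidate's score is the parent sequence's score plus the log-probability of the added word, and only the K best candidates survive.

```python
import heapq

K = 2  # beam width: number of sequences kept after each step

# Step 1 state: (sequence, cumulative log-probability)
beam = [("The", -0.5), ("A", -0.9)]

# Step 2: log-probabilities of the candidate next words for each sequence
next_word_logprobs = {
    "The": {"cat": -0.8, "dog": -1.1},
    "A": {"mouse": -0.2, "lion": -1.5},
}

def expand_and_prune(beam, next_word_logprobs, k):
    """Extend every sequence by every candidate word, then keep the k best.

    Hypothetical helper for illustration; not taken from any library.
    """
    candidates = [
        (f"{seq} {word}", score + logprob)  # scores add because they are logs
        for seq, score in beam
        for word, logprob in next_word_logprobs[seq].items()
    ]
    # Higher (less negative) cumulative log-probability is better.
    return heapq.nlargest(k, candidates, key=lambda c: c[1])

new_beam = expand_and_prune(beam, next_word_logprobs, K)
for seq, score in new_beam:
    print(f"{seq}: {score:.1f}")
```

Note that all four extended sequences compete in a single pool, so a sequence's rank in the previous step does not guarantee that its extensions survive.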
Analyzing Search Algorithm Behavior
Diagnosing a Flaw in Sequence Generation
You are tuning decoding for an internal "meeting-n...
You're deploying an LLM to draft customer-facing i...
You're building an internal "RFP response drafter"...
You're implementing an LLM feature that generates ...
Post-incident analysis: fixing repetition and truncation by tuning decoding
Debugging Decoding: Balancing Determinism, Diversity, and Length in a Regulated Product
Selecting and Justifying a Decoding Policy for Two Production Use Cases
Choosing a Decoding Configuration Under Latency, Diversity, and Length Constraints
Release-readiness decision: decoding configuration for a customer-facing summarization feature
Decoding policy decision for a multilingual support assistant under safety, latency, and verbosity constraints
Learn After
Balancing Efficiency and Accuracy with Beam Width (K)
An engineer is using a text generation model that employs a search algorithm where a parameter, K, determines the number of top candidate sequences kept at each step. The engineer observes that with K=1, the generated text is often repetitive and predictable. To improve the diversity and potential quality of the output, which of the following adjustments to K is the most logical next step?
Analyzing Generation Algorithm Performance
Analyzing Parameter Impact on Text Generation
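To make the effect of K concrete, the following minimal sketch runs the same search with K=1 and K=2 on a made-up model. Everything here is an assumption for illustration: `TOY_MODEL` is a hypothetical lookup table of next-token probabilities, not a real language model, and `beam_search` is a simplified version with a fixed number of steps and no end-of-sequence handling. With K=1 the search commits to the single best token at each step (greedy decoding); with K=2 it can keep a lower-ranked first token whose continuation yields a higher total score.

```python
import math

# Hypothetical toy model: next-token probabilities given the tokens so far.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.5, "y": 0.5},
    ("b",): {"x": 0.9, "y": 0.1},
}

def beam_search(model, k, steps):
    """Keep the k highest-scoring sequences after each of `steps` expansions."""
    beam = [((), 0.0)]  # (token tuple, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + (tok,), score + math.log(p))
            for seq, score in beam
            for tok, p in model[seq].items()
        ]
        # Sort best-first and prune to the beam width.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:k]
    return beam

greedy_best = beam_search(TOY_MODEL, k=1, steps=2)[0]
wider_best = beam_search(TOY_MODEL, k=2, steps=2)[0]

print("K=1:", greedy_best[0], round(math.exp(greedy_best[1]), 2))
print("K=2:", wider_best[0], round(math.exp(wider_best[1]), 2))
```

In this toy setup the wider beam finds a sequence with higher overall probability, at the cost of scoring roughly K times as many candidates per step, which is the efficiency and accuracy trade-off that beam width controls.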