Greedy Search (Greedy Decoding)
Greedy search, also known as greedy decoding, is one of the most widely used decoding algorithms in natural language processing tasks such as machine translation. The idea is simple: make a locally optimal decision at each generation step by selecting the next token with the highest conditional probability. The total log-probability of the resulting sequence is just the sum of these per-step conditional log-probabilities. Note that greedy search never compares alternative sequences by their overall log-probability; because it commits to one token per step, it can miss a sequence that is globally more probable.
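The step-by-step selection described above can be sketched in a few lines of Python. The toy "model" below (a lookup table of next-token distributions), its vocabulary, and its probabilities are illustrative assumptions, not from the text.

```python
import math

# Toy "language model": maps a context tuple to a next-token
# probability distribution (illustrative numbers only).
TOY_MODEL = {
    (): {"The": 0.7, "A": 0.3},
    ("The",): {"quick": 0.5, "slow": 0.3, "lazy": 0.2},
    ("The", "quick"): {"brown": 0.6, "red": 0.4},
    ("The", "quick", "brown"): {"fox": 0.8, "<eos>": 0.2},
    ("The", "quick", "brown", "fox"): {"<eos>": 1.0},
}

def greedy_decode(model, max_len=10):
    """At every step, commit to the single most probable next token."""
    seq, total_logprob = [], 0.0
    for _ in range(max_len):
        dist = model[tuple(seq)]
        token, prob = max(dist.items(), key=lambda kv: kv[1])
        total_logprob += math.log(prob)  # log-probs add across steps
        if token == "<eos>":
            break
        seq.append(token)
    return seq, total_logprob

tokens, lp = greedy_decode(TOY_MODEL)
print(tokens, lp)
```

Note that the running total is updated by simple addition of conditional log-probabilities; this is the same incremental-summation principle used in the exercises below.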

Ch.5 Inference - Foundations of Large Language Models
Related
Beam search
A language model is generating a sequence of tokens. The total log-probability for the partially generated sequence 'The quick brown' has been calculated as -3.5. In the very next step, the model computes the conditional log-probability for the token 'fox' as -1.2. What is the new total log-probability for the complete sequence 'The quick brown fox'?
A language model is generating a sequence. The table below shows the conditional log-probability for each new token and the claimed total accumulated log-probability for the sequence up to that point. Analyze the table to identify the first step where the total accumulated log-probability is calculated incorrectly based on the principle of incremental summation.
Step  Token   Conditional log-prob  Total Accumulated log-prob
1     'The'   -0.9                  -0.9
2     'cat'   -1.5                  -2.4
3     'sat'   -1.1                  -2.6

Comparing Generation Paths
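Both accumulation questions above rest on one rule: the total log-probability after a step equals the previous total plus the new token's conditional log-probability (for the first question, -3.5 + (-1.2) = -4.7). A minimal sketch that checks the table against this rule, using the numbers from the table itself:

```python
# (token, conditional log-prob, claimed running total) from the table above.
steps = [
    ("The", -0.9, -0.9),
    ("cat", -1.5, -2.4),
    ("sat", -1.1, -2.6),
]

running = 0.0
first_error = None
for i, (token, cond_lp, claimed_total) in enumerate(steps, start=1):
    running += cond_lp  # correct total: previous total + conditional log-prob
    if first_error is None and abs(running - claimed_total) > 1e-9:
        first_error = i

print(first_error)  # first step whose claimed total breaks the summation rule
```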
A company is developing two applications using a language model. Application A is a tool for generating formal, standardized financial reports where it is critical that the same input data always produces the exact same summary. Application B is a creative writing assistant designed to help authors brainstorm diverse plot ideas. Which application is a more suitable use case for a deterministic decoding algorithm, and why?
Chatbot Performance Analysis
Evaluating Decoding Strategies for Conversational AI
Formula for Pruned Step-wise Expansion of the Hypothesis Set
A language model is generating a sentence and must decide on the next word. It has identified 100 possible words, each with an associated probability. To manage computational resources, the model employs a strategy that discards all but the top 5 most probable words before considering the subsequent step. Which of the following statements best analyzes the primary trade-off inherent in this strategy?
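The pruning strategy in the question above can be sketched as follows. The 100 candidate words and their probabilities are randomly generated placeholders (not from the text), and k = 5 matches the scenario: everything outside the top k is discarded before the next step, trading completeness for compute and memory.

```python
import heapq
import random

random.seed(0)
# 100 hypothetical candidate words with placeholder probabilities.
candidates = {f"word_{i}": random.random() for i in range(100)}

k = 5
# Keep only the k most probable candidates before expanding the next step;
# all other hypotheses are pruned and can never be recovered later.
pruned = heapq.nlargest(k, candidates.items(), key=lambda kv: kv[1])

print(len(pruned))
```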
Analyzing Text Generation System Performance
Rationale for Decoding Heuristics
Learn After
Mathematical Justification for Greedy Search
Construction of the Optimal Sequence in Greedy Search
Candidate Set in Greedy Search
A language model is generating a two-token sequence. At the first step, it calculates the probability for the next token: 'Token A' has a probability of 0.6, and 'Token B' has a probability of 0.4. If the model chooses 'Token A', the most probable subsequent token is 'Token C' (with a conditional probability of 0.5). If the model had chosen 'Token B', the most probable subsequent token would be 'Token D' (with a conditional probability of 0.9). A text generation algorithm is used that, at every step, commits to the single token with the highest immediate probability. Based on this process, which sequence will be generated and why?
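The scenario above can be traced directly in code, using only the probabilities given in the question:

```python
# Step-one probabilities and the best two-token paths from the scenario.
p = {
    ("A",): 0.6, ("B",): 0.4,
    ("A", "C"): 0.6 * 0.5,  # A, then its best continuation C
    ("B", "D"): 0.4 * 0.9,  # B, then its best continuation D
}

# The algorithm commits to the highest immediate probability at step one,
# so it can never reach the B -> D path, even though that path has the
# higher overall probability.
first = "A" if p[("A",)] > p[("B",)] else "B"
greedy_path = ("A", "C") if first == "A" else ("B", "D")

print(greedy_path, p[greedy_path], p[("B", "D")])
```

Multiplying along each path shows why this is a suboptimal outcome: the greedy path has probability 0.6 x 0.5 = 0.30, while the forgone path has 0.4 x 0.9 = 0.36.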
Algorithm Suitability for Text Generation Tasks
When generating a sequence of text, an algorithm that selects the single most probable token at each step is guaranteed to produce the overall most probable sequence.
Analyzing Suboptimal Outcomes in Text Generation
Selecting and Justifying a Decoding Policy for Two Production Use Cases
Debugging Decoding: Balancing Determinism, Diversity, and Length in a Regulated Product
Post-incident analysis: fixing repetition and truncation by tuning decoding
Choosing a Decoding Configuration Under Latency, Diversity, and Length Constraints
Release-readiness decision: decoding configuration for a customer-facing summarization feature
Decoding policy decision for a multilingual support assistant under safety, latency, and verbosity constraints