Learn Before
Diagnosing a Flaw in Sequence Generation
A text generation model is tasked with producing a summary. It explores several candidate summaries and calculates a score for each by summing the log-probabilities of its words. The model's goal is to output the sequence with the highest score. Review the two final candidates below and explain the fundamental flaw in this scoring method that leads the model to select the suboptimal summary. Then, describe the general principle of a technique that could correct for this flaw.
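The flaw the question points at can be seen numerically: summing log-probabilities penalizes every added word (each log-probability is negative), so a longer, higher-quality summary can score below a shorter, blander one. A minimal sketch, using hypothetical per-word log-probabilities, contrasts raw sum scoring with length normalization (dividing the total log-probability by the sequence length, optionally raised to a power alpha), the standard correction:

```python
def sum_score(logprobs):
    # Raw score: sum of per-word log-probabilities.
    return sum(logprobs)

def length_normalized_score(logprobs, alpha=1.0):
    # Per-word score: divide by length^alpha so extra words
    # are not penalized merely for existing.
    return sum(logprobs) / (len(logprobs) ** alpha)

# Hypothetical candidates (not from the source question):
short = [-0.9, -1.0]                     # 2 words, each fairly unlikely
long_ = [-0.3, -0.4, -0.3, -0.5, -0.5]   # 5 words, each quite likely

# Raw sums favor the short candidate...
assert sum_score(short) > sum_score(long_)
# ...while per-word scores favor the longer, higher-quality one.
assert length_normalized_score(long_) > length_normalized_score(short)
```

With alpha between 0 and 1 the normalization is softened, trading off between the raw sum (alpha = 0) and a strict per-word average (alpha = 1).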
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Beam Width (K)
Top-K Token Selection in Beam Search
A text generation model is creating a sequence of words. It uses a search process that keeps track of the 2 most probable sequences at each step. The score for a sequence is the sum of the log-probabilities of its words. Given the state of the search below, which two sequences will be kept for the next step?
Step 1: The initial two sequences being tracked are:
- Sequence 1: "The" (Score: -0.5)
- Sequence 2: "A" (Score: -0.9)
Step 2: The model calculates the log-probabilities for the next possible words for each sequence:
- Expanding "The":
  - "cat": -0.8
  - "dog": -1.1
- Expanding "A":
  - "mouse": -0.2
  - "lion": -1.5
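The selection step above can be sketched directly: expand every sequence in the beam, add each next word's log-probability to the prefix score, and keep the K highest-scoring candidates. A minimal sketch using the numbers from the question:

```python
# Beam search step with beam width K = 2.
# A sequence's score is the sum of the log-probabilities of its words.
beam = [("The", -0.5), ("A", -0.9)]

# Next-word log-probabilities from the question.
expansions = {
    "The": [("cat", -0.8), ("dog", -1.1)],
    "A": [("mouse", -0.2), ("lion", -1.5)],
}

# Expand every sequence in the beam with every candidate next word.
candidates = []
for prefix, score in beam:
    for word, logp in expansions[prefix]:
        candidates.append((f"{prefix} {word}", score + logp))

# Keep only the K highest-scoring candidates for the next step.
K = 2
beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:K]
print(beam)
```

Working the arithmetic: "A mouse" scores -0.9 + -0.2 = -1.1 and "The cat" scores -0.5 + -0.8 = -1.3, so those two sequences survive; note that the lower-scoring prefix "A" produced the best candidate, which is exactly why beam search tracks more than one sequence.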
Analyzing Search Algorithm Behavior
Diagnosing a Flaw in Sequence Generation
You are tuning decoding for an internal "meeting-n...
You're deploying an LLM to draft customer-facing i...
You're building an internal "RFP response drafter"...
You're implementing an LLM feature that generates ...
Post-incident analysis: fixing repetition and truncation by tuning decoding
Debugging Decoding: Balancing Determinism, Diversity, and Length in a Regulated Product
Selecting and Justifying a Decoding Policy for Two Production Use Cases
Choosing a Decoding Configuration Under Latency, Diversity, and Length Constraints
Release-readiness decision: decoding configuration for a customer-facing summarization feature
Decoding policy decision for a multilingual support assistant under safety, latency, and verbosity constraints