Learn Before
Debugging a Text Generation System
An engineer is debugging a text generation model that uses a search algorithm to build sentences. The model is producing very predictable and often repetitive outputs. For example, when prompted to complete 'The weather today is...', it consistently generates 'The weather today is nice. The weather today is nice.' Upon inspecting the generation process, the engineer notes that at each step, only the single most probable next word is ever considered to extend the current sequence.
Based on this observation, what specific aspect of the token selection process is likely causing this issue, and how should it be adjusted to encourage more diverse and potentially higher-quality outputs? Explain your reasoning.
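The behaviour the engineer observed can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: the toy next-word table `TOY_MODEL`, its probabilities, and the parameter name `k` for the number of candidates kept per step are hypothetical and are not taken from the system in the scenario.

```python
import math

# Hypothetical toy model: maps the last word to next-word probabilities.
TOY_MODEL = {
    "is": {"nice": 0.5, "cold": 0.3, "rainy": 0.2},
    "nice": {".": 0.6, "today": 0.4},
    "cold": {".": 0.9, "today": 0.1},
    "rainy": {".": 0.95, "today": 0.05},
}


def extend(beams, k):
    """One search step: expand every sequence with its k most probable
    next words, then keep only the k best candidates overall."""
    candidates = []
    for words, log_prob in beams:
        options = TOY_MODEL[words[-1]]
        best = sorted(options.items(), key=lambda kv: kv[1], reverse=True)[:k]
        for word, p in best:
            candidates.append((words + [word], log_prob + math.log(p)))
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:k]


def search(start, steps, k):
    """Run `steps` search steps from `start`, keeping k sequences per step."""
    beams = [(start, 0.0)]
    for _ in range(steps):
        beams = extend(beams, k)
    return beams


# k=1 considers a single word per step; a larger k keeps alternatives alive.
single = search(["is"], steps=2, k=1)
wider = search(["is"], steps=2, k=2)
```

Comparing `single` with `wider` shows how many distinct continuations survive under each setting, which is the quantity the question asks you to reason about.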
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Construction of Top-K Candidate Sequences in Beam Search
Mathematical Definition of Top-K Token Selection
A language model is generating text using a search algorithm. At a certain step, it has the partial sequence 'The cat sat on the' and calculates the following probabilities for the next word from its vocabulary:
Word      Probability
mat       0.45
rug       0.25
chair     0.15
floor     0.10
table     0.03
window    0.02

If the algorithm is configured to select the 3 most probable next words at this step, which set of words will be chosen to create new candidate sequences?
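The selection step in this question can be sketched in a few lines. This is only one possible way to do it: the probabilities are copied from the table above, while the helper name `top_k_words` and the variable names are hypothetical.

```python
import heapq

# Next-word probabilities copied from the table above.
next_word_probs = {
    "mat": 0.45,
    "rug": 0.25,
    "chair": 0.15,
    "floor": 0.10,
    "table": 0.03,
    "window": 0.02,
}


def top_k_words(probs, k):
    """Return the k most probable words, highest probability first."""
    return heapq.nlargest(k, probs, key=probs.get)


prefix = "The cat sat on the"
# Each selected word extends the prefix into one new candidate sequence.
candidates = [f"{prefix} {word}" for word in top_k_words(next_word_probs, 3)]
```

Changing the second argument of `top_k_words` shows how the number of candidate sequences grows or shrinks with the configured value.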
Debugging a Text Generation System
A text generation system is designed to explore multiple possible sentence continuations at each step. It does this by selecting a fixed number of the most probable next words from its entire vocabulary. Match each parameter setting or concept with its most likely consequence or definition.