Analyzing Suboptimal Text Generation
An autoregressive language model generates text by selecting the single most probable token at each step, based on the sequence generated so far. The model is given the prompt 'The best restaurant in town is known for its delicious food and' and generates the next token 'a', resulting in the sequence '...food and a'. A human might have preferred a completion like '...food and amazing atmosphere'.
Explain how the model's step-by-step selection process could lead to the choice of 'a', even though it results in a less coherent overall sentence.
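One way to see the effect is a small sketch that compares greedy selection with an exhaustive search over two-token continuations. The probabilities in `COND` are invented purely for illustration and do not come from any real model; the names `COND`, `greedy`, and `exhaustive` are likewise hypothetical.

```python
# Hypothetical conditional probabilities P(next token | context).
# All numbers are invented for illustration only.
COND = {
    ("and",): {"a": 0.55, "amazing": 0.45},
    ("and", "a"): {"nice": 0.40, "lot": 0.35, "good": 0.25},
    ("and", "amazing"): {"atmosphere": 0.90, "service": 0.10},
}


def greedy(context, steps):
    """Pick the single most probable token at each step."""
    context = tuple(context)
    generated, prob = [], 1.0
    for _ in range(steps):
        dist = COND[context]
        token = max(dist, key=dist.get)  # locally best choice
        prob *= dist[token]
        generated.append(token)
        context += (token,)
    return generated, prob


def exhaustive(context, steps):
    """Search every continuation and keep the most probable sequence."""
    context = tuple(context)
    if steps == 0:
        return [], 1.0
    best_tokens, best_prob = [], 0.0
    for token, p in COND[context].items():
        tail, tail_prob = exhaustive(context + (token,), steps - 1)
        if p * tail_prob > best_prob:
            best_tokens, best_prob = [token] + tail, p * tail_prob
    return best_tokens, best_prob


greedy_tokens, greedy_prob = greedy(["and"], 2)
best_tokens, best_prob = exhaustive(["and"], 2)
print("greedy:    ", greedy_tokens, round(greedy_prob, 3))
print("exhaustive:", best_tokens, round(best_prob, 3))
```

Under these assumed numbers, 'a' wins the first step because it is the most probable single token, but the probability mass after 'a' is spread over many continuations, so the two-token sequence starting with 'amazing' ends up more probable overall. Greedy selection never revisits the first choice, so it cannot recover that sequence.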
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An autoregressive language model generates text one token at a time. At each step, it chooses the single token with the highest conditional probability based on the entire sequence generated so far. The model starts with the context 'The dog' and must choose the next two tokens.
Given the following table of conditional probabilities, which sequence of two tokens will the model generate?
Current Context     Next Token    Probability
'The dog'           'barked'      0.7
'The dog'           'ran'         0.2
'The dog'           'ate'         0.1
'The dog barked'    'loudly'      0.9
'The dog barked'    'at'          0.1
'The dog ran'       'away'        0.6
'The dog ran'       'to'          0.4
An autoregressive model generates a sequence of three tokens after an initial start token, <s>. It does this by selecting the single most probable token at each step based on the sequence generated so far. Arrange the following actions into the correct chronological order that the model follows.
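The step-by-step procedure described in these related questions can be sketched as a loop over a lookup table. The sketch below uses the 'The dog' probabilities from the table above; the names `TABLE` and `greedy_decode` are hypothetical, and a real model would compute each distribution rather than look it up.

```python
# Conditional probabilities copied from the 'The dog' table.
TABLE = {
    "The dog": {"barked": 0.7, "ran": 0.2, "ate": 0.1},
    "The dog barked": {"loudly": 0.9, "at": 0.1},
    "The dog ran": {"away": 0.6, "to": 0.4},
}


def greedy_decode(context, table, num_tokens):
    """Repeatedly append the most probable next token to the context."""
    chosen = []
    for _ in range(num_tokens):
        dist = table[context]             # 1. get the distribution for the current context
        token = max(dist, key=dist.get)   # 2. select the highest-probability token
        chosen.append(token)
        context = f"{context} {token}"    # 3. extend the context and repeat
    return chosen


result = greedy_decode("The dog", TABLE, 2)
print(result)
```

Each iteration conditions on everything generated so far, which is why the second lookup uses the extended context rather than the original one.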