Analyzing Generation Processes
A team is developing two text generation models, Model A and Model B. Both are given the same input phrase, "The sky is".
- Model A calculates the probability for the next word (e.g., "blue") based on the full input phrase "The sky is".
- Model B calculates the probability for the next word (e.g., "blue") based on the full input phrase "The sky is" AND any words it has already generated in the current output sequence.
One of these models represents a fundamentally flawed approach for generating coherent, multi-word sequences. Identify which model is flawed and explain why its method for determining the probability of the next word is problematic for generating longer, sensible sentences.
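The difference between the two conditioning strategies can be sketched with a toy example. The lookup table, token choices, and fallback rule below are invented purely for illustration; they stand in for a real model's learned next-token distribution:

```python
# Hypothetical next-token table: maps a context (a tuple of tokens) to the
# single most likely next token. Invented for illustration only.
TABLE = {
    ("The", "sky", "is"): "blue",
    ("The", "sky", "is", "blue"): "and",
    ("The", "sky", "is", "blue", "and"): "clear",
}

def next_token(context):
    # Stand-in for argmax over P(token | context); "blue" is an
    # arbitrary fallback for contexts not in the toy table.
    return TABLE.get(tuple(context), "blue")

def generate(prompt, steps, use_generated):
    out = []
    for _ in range(steps):
        # Model A ignores its own output; Model B appends it to the context.
        context = list(prompt) + (out if use_generated else [])
        out.append(next_token(context))
    return out

prompt = ["The", "sky", "is"]
print(generate(prompt, 3, use_generated=False))  # Model A: ['blue', 'blue', 'blue']
print(generate(prompt, 3, use_generated=True))   # Model B: ['blue', 'and', 'clear']
```

Because Model A's context never changes, its next-token distribution never changes either, so it cannot build on what it has already said; Model B's growing context is what lets each new token stay consistent with the sequence so far.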
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Related
A language model generates an output sequence one token at a time, where each new token's probability depends on prior information. If the model has already produced the first three tokens of an output based on a given input sequence, which of the following best describes the complete set of information used to calculate the probability for the fourth token?
Analyzing a Translation Model's Error