A language model generates an output sequence one token at a time, where each new token's probability depends on prior information. If the model has already produced the first three tokens of an output based on a given input sequence, which of the following best describes the complete set of information used to calculate the probability for the fourth token?
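The conditioning described above can be sketched in code: the distribution for the fourth output token is a function of the entire input sequence together with the three output tokens already generated. This is a minimal toy illustration with a hypothetical stand-in scoring function, not any specific model's implementation:

```python
def next_token_probs(context):
    # Hypothetical stand-in for a language model: returns a probability
    # distribution over a tiny vocabulary. Any function of the *entire*
    # context works for illustration; real models use a neural network.
    vocab = ["a", "b", "c"]
    scores = [(len(context) + i) % 3 + 1 for i, _ in enumerate(vocab)]
    total = sum(scores)
    return {tok: s / total for tok, s in zip(vocab, scores)}

input_tokens = ["the", "cat", "sat"]   # the given input sequence
generated = ["y1", "y2", "y3"]         # first three output tokens already produced

# The complete set of information for the 4th token's probability:
# the input sequence plus all previously generated output tokens.
context = input_tokens + generated
p4 = next_token_probs(context)
```

The key point the sketch encodes is that `context` contains both the input sequence and every output token produced so far; nothing else feeds into `p4`.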
Ch.2 Generative Models - Foundations of Large Language Models