Learn Before
Feasibility of Parallel Token Generation
A developer proposes modifying a standard autoregressive language model to generate all the words of a sentence at the same time, in a single step, to make it faster. Based on the fundamental principle of how these models generate output, explain why this parallel generation approach is not possible.
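The sequential dependency the question points at can be made concrete with a minimal sketch. Here `toy_next_token` is a stand-in (an assumption for illustration, not a real model): like an LLM's forward pass, it computes the next token as a function of the entire context generated so far, which is why step t cannot begin until step t-1 has finished.

```python
# Toy illustration of why autoregressive decoding is inherently sequential.

def toy_next_token(context):
    # Stand-in for a model's forward pass: the prediction is a
    # function of ALL tokens generated so far.
    return (sum(context) * 31 + len(context)) % 50

def generate(prompt, n_tokens):
    tokens = list(prompt)
    for _ in range(n_tokens):
        # This call's input includes the output of the previous
        # iteration, so the iterations cannot run in parallel.
        tokens.append(toy_next_token(tokens))
    return tokens

print(generate([7], 3))
```

Because each appended token feeds into the next call, generating all positions "at the same time" would require every step to already know outputs that have not been computed yet.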
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Latency from Sequential Dependency in Autoregressive Generation
An autoregressive model is in the process of generating the four-token sequence A B C D. At the specific step where it is predicting token D, what information serves as the context for this prediction?
Feasibility of Parallel Token Generation
An autoregressive model is tasked with generating the three-token sequence 'The cat sat'. Arrange the following computational steps in the correct chronological order.