In a sequence generation process, the set of candidate sequences at step i, denoted Y_i, is generated from the previous set Y_{i-1} and the entire vocabulary V. Consider the difference between two methods for generating Y_i:
Method A: Y_i = Y_{i-1} × V
Method B: Y_i = Prune(Y_{i-1} × V)
What is the most significant practical difference in the outcome of using Method B instead of Method A, particularly for generating longer sequences?
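To make the comparison concrete, the two methods can be sketched on a toy problem. This is a minimal illustration, not a definitive implementation: the question does not define Prune, so the sketch assumes it keeps the top k candidates by score. The names `expand`, `prune`, `candidate_counts`, `toy_score`, and the width `k` are all hypothetical.

```python
def expand(candidates, vocab):
    """Y_{i-1} × V: append every vocabulary token to every candidate."""
    return [seq + (tok,) for seq in candidates for tok in vocab]


def prune(candidates, score, k):
    """Assumed form of Prune: keep only the k highest-scoring candidates."""
    return sorted(candidates, key=score, reverse=True)[:k]


def candidate_counts(vocab, steps, score=None, k=None):
    """Return |Y_i| for i = 1..steps; prune after each expansion if k is set."""
    candidates = [()]  # Y_0 holds only the empty sequence
    counts = []
    for _ in range(steps):
        candidates = expand(candidates, vocab)        # Method A step
        if k is not None:
            candidates = prune(candidates, score, k)  # Method B adds pruning
        counts.append(len(candidates))
    return counts


vocab = ["a", "b", "c"]


def toy_score(seq):
    # Hypothetical scoring function: prefer sequences with more "a" tokens.
    return seq.count("a")


counts_a = candidate_counts(vocab, steps=5)                       # Method A
counts_b = candidate_counts(vocab, steps=5, score=toy_score, k=2)  # Method B

print(counts_a)  # [3, 9, 27, 81, 243]
print(counts_b)  # [2, 2, 2, 2, 2]
```

Under these assumptions, Method A's candidate set grows as |V|^i, while Method B's stays bounded by k at every step, at the cost of discarding sequences that might have scored well later.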
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Related
Diagnosing a Failing Sequence Generation Algorithm
Applying Pruning in Sequence Generation