Learn Before
Analyzing a Formalism for Token Selection
A researcher attempts to formalize the process of selecting the K most probable next tokens with the following expression: $S = \arg\max_{y_i \in V} \Pr(y_i \mid x, y_{<i})$, where S is the set of selected tokens and V is the vocabulary. Identify the primary error in this formulation for a scenario where K > 1, and explain why it is incorrect.
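As a point of reference when reasoning about the expression, the following is a minimal Python sketch of the two selection operators involved. The helper names `arg_max` and `arg_top_k` and the toy distribution are illustrative assumptions, not part of the course material.

```python
import heapq


def arg_max(probs):
    """Return the single token with the highest probability."""
    return max(probs, key=probs.get)


def arg_top_k(probs, k):
    """Return the set of the k tokens with the highest probabilities."""
    return set(heapq.nlargest(k, probs, key=probs.get))


# Hypothetical next-token distribution over a tiny vocabulary.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}

best = arg_max(probs)       # one token
top2 = arg_top_k(probs, 2)  # a set containing two tokens
```

Comparing the return types of the two helpers (a single token versus a set of K tokens) may help when checking whether a formalism matches the intended selection.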
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
At a certain step in a sequence generation process, the probabilities for the next token over a vocabulary V = {'A', 'B', 'C', 'D', 'E'} are as follows: Pr('A')=0.1, Pr('B')=0.4, Pr('C')=0.05, Pr('D')=0.3, Pr('E')=0.15. If the selection process is defined by the function argTopK with K=3, which set of tokens will be selected?
Analyzing a Formalism for Token Selection
Construction of Top-K Candidate Sequences in Beam Search
Formula for Constructing Top-K Candidate Sequences
Evaluating a Token Selection Implementation