Learn Before
At a certain step in a sequence generation process, the probabilities for the next token over a vocabulary V = {'A', 'B', 'C', 'D', 'E'} are as follows: Pr('A')=0.1, Pr('B')=0.4, Pr('C')=0.05, Pr('D')=0.3, Pr('E')=0.15. If the selection process is defined by the function argTopK with K=3, which set of tokens will be selected?
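As a rough illustration of the selection rule in the question, the sketch below assumes that argTopK returns the K tokens with the highest next-token probabilities. The helper name arg_top_k and the dictionary representation of the distribution are hypothetical choices made for this example, not something the source defines.

```python
import heapq


def arg_top_k(probs, k):
    """Return the k tokens with the highest probability, highest first.

    `probs` maps each token to its probability (an assumed representation).
    """
    # Iterating over a dict yields its keys (the tokens);
    # probs.get ranks each token by its probability.
    return heapq.nlargest(k, probs, key=probs.get)


# Distribution from the question above.
probs = {"A": 0.1, "B": 0.4, "C": 0.05, "D": 0.3, "E": 0.15}

selected = arg_top_k(probs, k=3)
print(selected)
```

Running the sketch prints the selected tokens in descending order of probability. If only set membership matters, the result can be wrapped in set(...).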
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analyzing a Formalism for Token Selection
Construction of Top-K Candidate Sequences in Beam Search
Formula for Constructing Top-K Candidate Sequences
Evaluating a Token Selection Implementation