Formula for Constructing Top-K Candidate Sequences
In the beam search algorithm, the top candidate sequences for step i are generated by extending the previous step's prefixes with the newly selected tokens. Each new sequence is formed by appending a top token y to a prefix y_{<i}, written y_{<i} ∘ y. The final candidate set, denoted C, is then formally defined as C = { y_{<i} ∘ y | y ∈ argTopK_{y ∈ V} Pr(y | y_{<i}) }, where K represents the beam width.
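The construction above can be sketched in code. This is a minimal illustration, not a reference implementation: the function and variable names (top_k_candidates, toy_dist) are hypothetical, and the scoring model is a toy distribution matching the example vocabulary used later on this page.

```python
import math

def top_k_candidates(prefixes, next_token_log_probs, k):
    """One step of candidate construction: extend each scored prefix with
    every vocabulary token, then keep the k highest-scoring sequences
    (argTopK over the candidates, with k playing the role of beam width)."""
    candidates = []
    for prefix, prefix_score in prefixes:
        for token, log_p in next_token_log_probs(prefix).items():
            # Appending a token multiplies probabilities, i.e. adds log-probs.
            candidates.append((prefix + [token], prefix_score + log_p))
    # argTopK: sort candidates by score and keep the best k.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:k]

def toy_dist(prefix):
    # Hypothetical fixed next-token distribution over V = {'A', ..., 'E'}.
    probs = {'A': 0.1, 'B': 0.4, 'C': 0.05, 'D': 0.3, 'E': 0.15}
    return {t: math.log(p) for t, p in probs.items()}

# Starting from an empty prefix with score 0, the three highest-probability
# tokens are kept: B (0.4), D (0.3), E (0.15).
beam = top_k_candidates([([], 0.0)], toy_dist, k=3)
```

With K = 3 the surviving candidates are ['B'], ['D'], and ['E'], matching an argTopK selection over this distribution.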

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
At a certain step in a sequence generation process, the probabilities for the next token over a vocabulary V = {'A', 'B', 'C', 'D', 'E'} are as follows: Pr('A') = 0.1, Pr('B') = 0.4, Pr('C') = 0.05, Pr('D') = 0.3, Pr('E') = 0.15. If the selection process is defined by the function argTopK with K = 3, which set of tokens will be selected?
Analyzing a Formalism for Token Selection
Construction of Top-K Candidate Sequences in Beam Search
Formula for Constructing Top-K Candidate Sequences
Evaluating a Token Selection Implementation
Construction of the Optimal Sequence in Greedy Search
An autoregressive model generates the token sequence y_1, y_2, …, y_n, where y_2 is conditioned on y_1, y_3 on y_1 y_2, and so on. What does the notation y_{<i} represent in this specific sequence?
True or False: For an autoregressive model generating the output sequence y = y_1 y_2 … y_n, the notation y_{<i} represents the complete subsequence y_1 y_2 … y_{i-1}.
An autoregressive language model is generating a sequence of tokens, one at a time. To predict the fifth token in the sequence, denoted as y_5, the model uses all the previously generated tokens as context. The standard notation for this preceding subsequence of tokens is ____.
Learn After
A language model is generating a sequence of tokens. The sequence generated so far is [501, 243, 988]. At the current step, the model has identified the 3 most probable next tokens as [104, 675, 312]. Based on this information, what is the resulting set of new candidate sequences?
Deconstructing Candidate Sequences
Formula for the Candidate Set in Top-K Decoding
A language model is generating text using a top-K decoding strategy. Arrange the following steps in the correct order to describe how a single new candidate sequence is constructed from a given preceding sequence and a set of top-K next tokens.
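The candidate-set construction asked about in the related questions above can be worked through directly: each new candidate is the preceding sequence with one of the top-K tokens appended. This is a sketch using the token IDs from the example question; no real model or library is assumed.

```python
# Preceding sequence and top-3 next tokens from the example question.
prefix = [501, 243, 988]
top_k_tokens = [104, 675, 312]

# Each new candidate sequence = prefix with one top-K token appended,
# mirroring the set-builder form C = { y_{<i} o y | y in argTopK }.
candidates = [prefix + [t] for t in top_k_tokens]
```

This yields three candidate sequences, one per selected token, each of length len(prefix) + 1.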