Learn Before
Candidate Set in Sampling-Based Decoding
In sampling-based decoding methods, the set of candidate sequences at step i is a singleton set. It contains only the one sequence formed by extending the previous sequence with the token sampled at the current step.
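A sketch of how the singleton candidate set could be written, in assumed notation. The symbols are illustrative choices rather than the source's own: $Y_i$ is the candidate set at step $i$, $y_1, \ldots, y_{i-1}$ are the tokens generated so far, $\bar{y}_i$ is the token sampled at step $i$, and $\Pr(\cdot \mid y_1, \ldots, y_{i-1})$ is the model's next-token distribution (any conditioning on an input prompt is omitted for brevity).

```latex
\bar{y}_i \sim \Pr(\cdot \mid y_1, \ldots, y_{i-1}),
\qquad
Y_i = \bigl\{\, [\, y_1, \ldots, y_{i-1}, \bar{y}_i \,] \,\bigr\},
\qquad
|Y_i| = 1
```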

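The description above can be illustrated with a minimal Python sketch. Everything here is hypothetical and for illustration only: `TOY_MODEL` is a made-up lookup table standing in for a real language model, and `sample_next_token` and `sampling_decode` are names introduced for this example. The point it shows is that the candidate set recorded at each step holds exactly one sequence.

```python
import random

# Hypothetical toy "model": maps the last token to a next-token distribution.
# A real model would condition on the whole sequence; this is an assumption
# made to keep the sketch small.
TOY_MODEL = {
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.5, "ran": 0.5},
    "sat": {"down": 1.0},
    "ran": {"away": 1.0},
}


def sample_next_token(sequence, rng):
    """Sample one token from the toy distribution for the last token."""
    dist = TOY_MODEL[sequence[-1]]
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def sampling_decode(prefix, steps, rng):
    """Return the candidate set produced at each step of sampling."""
    candidate_sets = []
    current = list(prefix)
    for _ in range(steps):
        token = sample_next_token(current, rng)
        current = current + [token]          # extend the previous sequence
        candidate_sets.append([tuple(current)])  # singleton candidate set
    return candidate_sets


rng = random.Random(0)
history = sampling_decode(["The"], steps=2, rng=rng)
for step, candidates in enumerate(history, start=2):
    print(f"step {step}: {candidates}")
```

Each printed list has length one: the sampled continuation replaces the previous sequence, so there is nothing to rank or prune before the next step.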
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Candidate Set in Sampling-Based Decoding
In an autoregressive text generation process, the sequence generated up to a certain point is 'The dog chased the'. At the current step, the model generates and selects the token 'ball'. What is the new, extended sequence that will be used as the basis for generating the subsequent token?
An autoregressive model is generating a sequence. It begins with the single token y_1 = 'The'. In the next step, it samples the token ȳ_2 = 'cat'. Following that, it samples the token ȳ_3 = 'sat'. What is the resulting sequence that is formed after these two sampling steps?
Formal Representation of Sequence Extension
Learn After
During an autoregressive text generation process using a sampling-based method, the model has produced the sequence 'The sun is shining and the'. At the current step, the token 'sky' is sampled from the model's output distribution. Based on this single event, what is the complete set of candidate sequences that will be used as the basis for generating the next token?
In an autoregressive text generation process where a single token is chosen by sampling from a probability distribution at each step, the algorithm must keep track of several competing sequences simultaneously to decide which one to extend in the next step.
Candidate Set Composition in Sampling-Based Decoding