Learn Before
Top-k Selection Pool
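In top-k sampling, the 'selection pool' is the set of candidate tokens from which the next token is chosen. At a given step $i$, this pool, denoted $V_i^{(k)}$, contains the $k$ tokens with the highest probabilities. It is formally defined as:

$$V_i^{(k)} = \underset{y \in V}{\operatorname{arg\,top-}k} \; \Pr(y \mid x, y_{<i})$$

where $V$ is the full vocabulary and $\Pr(y \mid x, y_{<i})$ is the model's probability of token $y$ given the input $x$ and the previously generated tokens $y_{<i}$.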
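As a rough illustration of how the selection pool can be computed, the sketch below ranks a next-token distribution and keeps the k most probable entries. The function name `top_k_pool` is hypothetical, the probabilities are borrowed from the six-token example question listed under Learn After, and breaking ties by dictionary order is an assumption rather than part of the definition.

```python
def top_k_pool(probs: dict[str, float], k: int) -> list[str]:
    """Return the k highest-probability tokens, most likely first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # sorted() is stable, so tokens with equal probability keep their
    # original dictionary order (an assumed tie-breaking rule).
    ranked = sorted(probs, key=probs.get, reverse=True)
    return ranked[:k]


# Next-token probabilities from the example question on this page.
probs = {"the": 0.45, "a": 0.20, "cat": 0.12, "dog": 0.08, "ran": 0.07, "jumped": 0.05}

pool = top_k_pool(probs, k=4)
print(pool)  # ['the', 'a', 'cat', 'dog']
```

Sorting the whole vocabulary is the simplest approach for a small example; for a realistic vocabulary size a partial selection (for instance `heapq.nlargest`) would avoid a full sort.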

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of Top-k Sampling with k=3
Top-k Selection Pool
Probability Renormalization Formula for Restricted Vocabulary Sampling
Probability Renormalization Formula for Top-k Sampling
A language model is generating the next word in a sequence and has calculated the initial probabilities for the five most likely candidates: the (0.4), a (0.2), one (0.1), his (0.05), and her (0.05). If the model uses a sampling strategy where it only considers the top 3 most likely candidates (k=3), what will be the new, rescaled probability distribution for this reduced set of candidates from which the final word will be sampled?
Arrange the following actions into the correct sequence that describes the process of selecting the next token in a text generation model using the top-k sampling method.
Analyzing Text Generation Outputs
Learn After
Formal Derivation of the Top-k Selection Pool
A language model is generating text and has calculated the following probabilities for the next potential token: {'the': 0.45, 'a': 0.20, 'cat': 0.12, 'dog': 0.08, 'ran': 0.07, 'jumped': 0.05}. If the model is configured to sample its next choice from only the 4 most likely candidates, which set of tokens constitutes the selection pool?
Impact of Selection Pool Size on Text Generation
When generating text by sampling from a pool of the most probable candidate tokens, setting the pool size to 1 will produce the exact same output sequence as a method that always deterministically chooses the single token with the highest probability at every step.
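The pool-size-1 statement above, and the rescaling step asked about in the five-candidate question under Related, can be checked with a small sketch. The helper names `top_k_distribution` and `sample_top_k` are hypothetical, and the code assumes the usual renormalization in which each kept probability is divided by the sum of the kept probabilities.

```python
import random


def top_k_distribution(probs, k):
    """Keep the k most probable tokens and rescale them to sum to 1."""
    pool = sorted(probs, key=probs.get, reverse=True)[:k]
    total = sum(probs[t] for t in pool)
    return {t: probs[t] / total for t in pool}


def sample_top_k(probs, k, rng=random):
    """Draw one token from the renormalized top-k distribution."""
    dist = top_k_distribution(probs, k)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


# Candidate probabilities from the k=3 example question on this page.
probs = {"the": 0.4, "a": 0.2, "one": 0.1, "his": 0.05, "her": 0.05}

rescaled = top_k_distribution(probs, k=3)
print({t: round(p, 3) for t, p in rescaled.items()})
# {'the': 0.571, 'a': 0.286, 'one': 0.143}

# With k=1 the pool holds only the most probable token, so every draw
# matches the greedy (argmax) choice.
greedy = max(probs, key=probs.get)
print(all(sample_top_k(probs, k=1) == greedy for _ in range(100)))  # True
```

The equivalence with greedy selection holds as long as both methods break ties between equally probable tokens in the same way; here both fall back to dictionary order.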