Learn Before
Forming the Candidate Pool in Top-p Sampling
The candidate pool for top-p sampling is created through a two-step process. First, all potential next tokens are sorted by their predicted probabilities in descending order. Second, starting with the highest-probability token, tokens are cumulatively added to the pool until their combined probability meets or exceeds the predefined threshold 'p'.
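The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `form_candidate_pool`, the dictionary input format, and the example tokens and probabilities are all assumptions made for this sketch.

```python
def form_candidate_pool(probs: dict[str, float], p: float) -> list[str]:
    """Return the tokens in the top-p candidate pool, most probable first.

    `probs` maps each candidate token to its predicted probability
    (hypothetical input format chosen for this sketch).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")

    # Step 1: sort all candidate tokens by probability, highest first.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    # Step 2: add tokens one at a time until the running total
    # meets or exceeds the threshold p.
    pool = []
    cumulative = 0.0
    for token, prob in ranked:
        pool.append(token)
        cumulative += prob
        if cumulative >= p:
            break
    return pool


# Illustrative values only; they are not taken from the course material.
example_probs = {"bird": 0.15, "cat": 0.5, "fish": 0.05, "dog": 0.3}

pool_070 = form_candidate_pool(example_probs, p=0.7)
pool_090 = form_candidate_pool(example_probs, p=0.9)
print(pool_070)  # ['cat', 'dog']
print(pool_090)  # ['cat', 'dog', 'bird']
```

Note that the token which pushes the running total to or past the threshold is itself included in the pool. Because the running total is a floating-point sum, a threshold that lands exactly on a cumulative value may be sensitive to rounding.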
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Candidate Pool Size in Top-p Sampling (kp)
Forming the Candidate Pool in Top-p Sampling
A language model is generating text and has calculated the following probabilities for the next possible token: 'the' (0.45), 'a' (0.25), 'one' (0.15), 'it' (0.10), 'she' (0.05). If the model uses a sampling strategy with a probability threshold of p = 0.8, which set of tokens will form the final candidate pool (the 'nucleus') from which the next token is actually sampled?
A language model is configured to generate text by sampling from the smallest set of tokens whose cumulative probability exceeds a predefined threshold 'p'. Arrange the following steps of this process in the correct chronological order.
Applying the Top-p Sampling Process
Learn After
A language model is generating the next word and has calculated the following probabilities for the most likely tokens: {'the': 0.40, 'a': 0.25, 'one': 0.15, 'it': 0.10, 'is': 0.05}. If the model uses a probability threshold of p = 0.70 to create a candidate pool for sampling, which set of tokens will be included in that pool?
A text generation model needs to create a candidate pool of tokens for its next selection based on a cumulative probability threshold. Arrange the following actions in the correct chronological order to accurately construct this pool.
Determining the Probability Threshold