Learn Before
Candidate Pool Size in Top-p Sampling (kp)
In top-p sampling, tokens are sorted in descending order of probability and added to the candidate pool until their cumulative probability exceeds the threshold 'p'. The number of tokens in the resulting pool is denoted 'kp'.
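A minimal Python sketch of this pool-forming step, using the illustrative token probabilities from the practice question below ('the' 0.45, 'a' 0.25, etc.); the function name is a hypothetical helper, not a library API:

```python
def top_p_pool(probs, p):
    """Return the smallest set of highest-probability tokens whose
    cumulative probability meets or exceeds the threshold p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    pool, cumulative = [], 0.0
    for token, prob in ranked:
        pool.append(token)
        cumulative += prob
        if cumulative >= p:  # threshold reached: stop adding tokens
            break
    return pool

probs = {"the": 0.45, "a": 0.25, "one": 0.15, "it": 0.10, "she": 0.05}
pool = top_p_pool(probs, 0.8)
kp = len(pool)  # kp is the candidate pool size
```

Here 'the' and 'a' only reach 0.70, so 'one' must also be included to pass p = 0.8, giving kp = 3.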
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Candidate Pool Size in Top-p Sampling (kp)
Forming the Candidate Pool in Top-p Sampling
A language model is generating text and has calculated the following probabilities for the next possible token: 'the' (0.45), 'a' (0.25), 'one' (0.15), 'it' (0.10), 'she' (0.05). If the model uses a sampling strategy with a probability threshold of
p = 0.8, which set of tokens will form the final candidate pool (the 'nucleus') from which the next token is actually sampled?
A language model is configured to generate text by sampling from the smallest set of tokens whose cumulative probability exceeds a predefined threshold 'p'. Arrange the following steps of this process in the correct chronological order.
Applying the Top-p Sampling Process
Learn After
Mathematical Representation of the Top-p Candidate Pool
A language model is generating the next word and has calculated the following probabilities for the most likely tokens: Token A (0.40), Token B (0.30), Token C (0.15), Token D (0.10), and Token E (0.05). If the model uses a sampling strategy where it forms a candidate pool by including the most probable tokens until their cumulative probability just exceeds a threshold of 0.75, what will be the size of this candidate pool?
Relationship Between Threshold and Candidate Pool Size
A language model is generating the next token in two different contexts. In both contexts, the model uses a sampling method where it forms a candidate pool by selecting the smallest set of the most probable tokens whose cumulative probability exceeds a threshold of 0.9.
- Context A: The single most probable token has a probability of 0.95.
- Context B: The ten most probable tokens each have a probability of 0.09.
How will the size of the candidate token pool compare between these two contexts?