Learn Before
A language model is generating the next word in a sequence and has calculated the initial probabilities for the five most likely candidates: the (0.4), a (0.2), one (0.1), his (0.05), and her (0.05). If the model uses a sampling strategy where it only considers the top 3 most likely candidates (k=3), what will be the new, rescaled probability distribution for this reduced set of candidates from which the final word will be sampled?
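The rescaling the question asks about can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual top-k renormalization in which each retained probability is divided by the sum of the retained probabilities; the function name `top_k_renormalize` and the variable names are hypothetical and chosen only for this example.

```python
import random


def top_k_renormalize(probs, k):
    """Keep the k most likely candidates and rescale them to sum to 1.

    `probs` maps each candidate token to its original probability.
    (Hypothetical helper written for this illustration.)
    """
    # Sort candidates by probability, highest first, and keep the top k.
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # Sum of the retained probabilities: 0.4 + 0.2 + 0.1 = 0.7 here.
    total = sum(p for _, p in top)
    # Divide each retained probability by that sum.
    return {token: p / total for token, p in top}


candidates = {"the": 0.4, "a": 0.2, "one": 0.1, "his": 0.05, "her": 0.05}
rescaled = top_k_renormalize(candidates, k=3)

for token, p in rescaled.items():
    print(f"{token}: {p:.3f}")
# the: 0.571
# a: 0.286
# one: 0.143

# The final word is then drawn from the rescaled distribution.
next_word = random.choices(list(rescaled), weights=list(rescaled.values()), k=1)[0]
```

With k=3, the candidates his and her are discarded, and the remaining mass of 0.7 is rescaled so that the, a, and one keep their relative proportions (4 : 2 : 1).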
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example of Top-k Sampling with k=3
Top-k Selection Pool
Probability Renormalization Formula for Restricted Vocabulary Sampling
Probability Renormalization Formula for Top-k Sampling
A language model is generating the next word in a sequence and has calculated the initial probabilities for the five most likely candidates: the (0.4), a (0.2), one (0.1), his (0.05), and her (0.05). If the model uses a sampling strategy where it only considers the top 3 most likely candidates (k=3), what will be the new, rescaled probability distribution for this reduced set of candidates from which the final word will be sampled?
Arrange the following actions into the correct sequence that describes the process of selecting the next token in a text generation model using the top-k sampling method.
Analyzing Text Generation Outputs