1Cademy - A language model is generating the next word in a sequence and has calculated the initial probabilities for the five most likely candidates: `the` (0.4), `a` (0.2), `one` (0.1), `his` (0.05), and `her` (0.05). If the model uses a sampling strategy where it only considers the top 3 most likely candidates (k=3), what will be the new, rescaled probability distribution for this reduced set of candidates from which the final word will be sampled?

Learn Before

Top-k Sampling Process

Multiple Choice

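The rescaling the question asks about can be checked with a short sketch. It assumes the usual top-k procedure: keep the k highest-probability candidates, discard the rest, and divide each kept probability by the sum of the kept probabilities. The function name `top_k_rescale` and the variable names are illustrative only and do not come from any particular library.

```python
import random


def top_k_rescale(probs, k):
    """Keep the k most likely tokens and renormalize them so they sum to 1."""
    # Sort candidates by probability, highest first, and keep the first k.
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # The kept probabilities no longer sum to 1, so divide by their total.
    total = sum(p for _, p in top)
    return {tok: p / total for tok, p in top}


# Probabilities taken from the question.
candidates = {"the": 0.4, "a": 0.2, "one": 0.1, "his": 0.05, "her": 0.05}

rescaled = top_k_rescale(candidates, k=3)
for tok, p in rescaled.items():
    print(f"{tok}: {p:.4f}")

# The final word is then drawn from the rescaled distribution.
next_word = random.choices(list(rescaled), weights=list(rescaled.values()))[0]
```

With k=3 the kept probabilities sum to 0.7, so the rescaled values are 0.4/0.7, 0.2/0.7, and 0.1/0.7, which is approximately 0.571 for `the`, 0.286 for `a`, and 0.143 for `one`. The candidates `his` and `her` are excluded and cannot be sampled.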

Updated 2025-09-26
