Learn Before
A language model is generating the next word in a sequence and has calculated the initial probabilities for six candidate words: 'the' (0.40), 'a' (0.25), 'an' (0.15), 'some' (0.10), 'any' (0.05), and 'every' (0.05). The system uses top-k sampling with k = 4, so it considers only the 4 most likely candidates for the final selection. After discarding the other candidates, the probabilities of the remaining words are renormalized to sum to 1. What is the adjusted probability for the word 'a'?
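The renormalization in the question can be sketched as follows. This is a minimal illustration using the probabilities given above (not any particular library's top-k implementation): keep the k highest-probability words, then divide each surviving probability by their sum.

```python
# Top-k filtering and renormalization, using the question's values (k = 4).
probs = {'the': 0.40, 'a': 0.25, 'an': 0.15, 'some': 0.10,
         'any': 0.05, 'every': 0.05}
k = 4

# Keep the k most probable candidates.
top_k = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])

# Renormalize so the survivors sum to 1.
total = sum(top_k.values())          # 0.40 + 0.25 + 0.15 + 0.10 = 0.90
adjusted = {w: p / total for w, p in top_k.items()}

print(round(adjusted['a'], 4))       # 0.25 / 0.90 ≈ 0.2778
```

So the adjusted probability for 'a' is 0.25 / 0.90 ≈ 0.28.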
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A text generation model uses a method to select the next word where it only considers a small, fixed number of the most probable options. Arrange the following steps to accurately describe the sequence of this method.
Inferring Decoding Parameters