Learn Before
Inferring Decoding Parameters
A language model is generating the next word after the phrase 'The cat sat on the'. The model's internal calculations produce the initial probabilities for the top five potential words as shown below. The model uses a decoding strategy where only a fixed number of the most likely candidates are considered, and all others are discarded. A final word is then randomly sampled from this smaller group. Analyze the scenario and determine the minimum possible value for the parameter that sets this fixed number of candidates. Explain your reasoning.
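The strategy described above is top-k sampling: keep only the k most probable candidates, renormalize their probabilities, and sample from that smaller group. A minimal sketch, using hypothetical probabilities for illustration (the original item's probability table is not reproduced here):

```python
import random

def top_k_sample(probs, k):
    """Keep the k most probable words, renormalize, and sample one."""
    # Rank candidates by probability, highest first
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = ranked[:k]  # discard everything outside the top k
    total = sum(p for _, p in kept)
    # Renormalize the surviving probabilities so they sum to 1
    adjusted = {w: p / total for w, p in kept}
    words, weights = zip(*adjusted.items())
    return random.choices(words, weights=weights)[0]

# Hypothetical distribution for "The cat sat on the ..."
probs = {"mat": 0.35, "sofa": 0.25, "floor": 0.20, "bed": 0.12, "roof": 0.08}
print(top_k_sample(probs, k=3))  # one of 'mat', 'sofa', 'floor'
```

With k=1 this reduces to greedy decoding (always the single most likely word); larger k admits more variety into the sample.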
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is generating the next word in a sequence and has calculated the initial probabilities for six potential words: 'the' (0.40), 'a' (0.25), 'an' (0.15), 'some' (0.10), 'any' (0.05), and 'every' (0.05). The system uses a decoding strategy where it only considers the top 4 most likely candidates for the final selection. After discarding the other candidates, the probabilities of the remaining words are adjusted to sum to 1. What is the adjusted probability for the word 'a'?
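The renormalization asked for here can be checked directly: keeping the top 4 candidates leaves a surviving mass of 0.40 + 0.25 + 0.15 + 0.10 = 0.90, so 'a' is adjusted to 0.25 / 0.90 ≈ 0.278. A short sketch of the computation:

```python
probs = {"the": 0.40, "a": 0.25, "an": 0.15,
         "some": 0.10, "any": 0.05, "every": 0.05}
k = 4

# Keep the top-k candidates by probability
kept = dict(sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k])
total = sum(kept.values())        # 0.90, the surviving probability mass
adjusted_a = kept["a"] / total    # 0.25 / 0.90
print(round(adjusted_a, 3))       # 0.278
```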
A text generation model uses a method to select the next word where it only considers a small, fixed number of the most probable options. Arrange the following steps to accurately describe the sequence of this method.