Learn Before
Analyzing Text Generation Behavior
Based on the principles of sorting and filtering candidate words by probability, which configuration in the case study likely uses a smaller 'k' value (the number of top candidates to retain) and which uses a larger 'k' value? Explain your reasoning by describing how the number of candidates kept after sorting affects the diversity of the generated text.
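As a rough illustration of how 'k' changes the candidate pool, the sketch below sorts a token distribution, keeps the top k entries, and samples from what remains. The probabilities are borrowed from the top-3 question listed under Related on this page. The function names `top_k_filter` and `sample_tokens` are hypothetical, and renormalizing the kept probabilities is an assumption based on how top-k sampling is commonly described; the case study itself may differ.

```python
import random

def top_k_filter(probs, k):
    """Sort tokens by probability, keep the k most probable, and renormalize.

    Renormalization is an assumption (common in top-k sampling), not something
    stated in the case study.
    """
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

def sample_tokens(probs, n, seed=0):
    """Draw n tokens from a (filtered) distribution with a fixed seed."""
    rng = random.Random(seed)
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n)

# Probabilities taken from the related top-3 question on this page.
probs = {
    "mat": 0.45,
    "floor": 0.25,
    "windowsill": 0.15,
    "couch": 0.10,
    "table": 0.05,
}

small_k = top_k_filter(probs, 1)  # only the single most probable token survives
large_k = top_k_filter(probs, 5)  # every candidate survives

# Distinct tokens seen across 200 draws under each setting.
distinct_small = set(sample_tokens(small_k, 200))
distinct_large = set(sample_tokens(large_k, 200))
```

With k = 1 every draw returns the same token, so the output is repetitive and predictable. As k grows, lower-probability tokens stay in the pool and the sampled text can vary more from run to run.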
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A company wants its customer service chatbot, which is powered by a large language model, to provide real-time order tracking information to users. The model was not trained on this specific, dynamic data, and the company wants to avoid the cost and complexity of constantly retraining the model. Which of the following approaches is the best example of using an external tool to enhance the model's capabilities at the time of use?
A language model is generating the next word and has calculated the initial probabilities for five potential tokens: 'mat' (0.45), 'floor' (0.25), 'windowsill' (0.15), 'couch' (0.10), and 'table' (0.05). If the model is configured to only consider the top 3 most probable tokens for the next step, which set of tokens is kept after the ranking and pruning stage?
Analyzing Text Generation Behavior