Learn Before
Interpreting Model Predictions
A predictive model is generating a sequence of words. At a specific point in time i, it has generated the phrase 'The sky is'. The model then calculates a value for two potential next words, t=1 step into the future. It finds that and . Based on this information, which word is the model more likely to choose next, and what does the function represent in this context?
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Acceptance-Rejection Mechanism for Speculative Decoding
Inequality Constraint for Predicted Future Value Functions
Condition for Rejecting Speculation
Consider a text generation system that uses a fast, approximate model to propose a potential future word. For each proposed word, a more accurate but slower model also calculates a probability. Suppose at a certain step i, the fast model predicts the next word will be 'universe' (represented as ). The fast model's confidence in this specific prediction is calculated and denoted as . Based on this information, what is the most accurate interpretation of the value 0.8?
In the context of a system that generates sequences of values (like words in a sentence), the expression is often used. Match each component of this expression to its correct description.
Rejection Criterion in Speculative Sampling
Interpreting Model Predictions
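
The draft-and-verify setup described in the related items above (a fast model proposing 'universe' with confidence 0.8, checked by a slower, more accurate model) can be illustrated with a minimal sketch. It assumes the commonly used speculative sampling rule, in which a drafted token is kept with probability min(1, p/q), where q is the fast model's probability for that token and p is the accurate model's. The function names and the value 0.4 for the accurate model are hypothetical and chosen only for illustration; the exact criterion and notation used in this course may differ.

```python
import random


def acceptance_probability(p_target: float, q_draft: float) -> float:
    """Probability of keeping a drafted token under the assumed min(1, p/q) rule."""
    if q_draft <= 0.0:
        # A token the draft model actually proposed must have positive probability.
        raise ValueError("q_draft must be positive")
    return min(1.0, p_target / q_draft)


def accept_draft_token(p_target: float, q_draft: float, rng: random.Random) -> bool:
    """Draw once from rng and decide whether the drafted token is kept."""
    return rng.random() < acceptance_probability(p_target, q_draft)


# The fast model proposes 'universe' with confidence 0.8 (value from the question).
# The accurate model's probability of 0.4 is a hypothetical value.
print(acceptance_probability(p_target=0.4, q_draft=0.8))
print(accept_draft_token(p_target=0.4, q_draft=0.8, rng=random.Random(0)))
```

Under this rule, a token that the accurate model rates at least as highly as the fast model did is always kept, while a token the fast model was overconfident about is rejected in proportion to the mismatch.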