Evaluating Proposed Tokens in a Generation Process
A text generation system uses a small 'draft' model and a large 'target' model to speed up output. The draft model proposes a sequence of tokens, and an acceptance-rejection mechanism decides whether to keep them. For each proposed token, analyze the provided probabilities and determine whether the token is (A) accepted outright or (B) subject to a probabilistic check. If it is subject to a probabilistic check, calculate the specific probability of rejection.
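The acceptance-rejection mechanism referred to here is the standard speculative-decoding check: accept outright when the target probability is at least the draft probability, otherwise accept with probability equal to their ratio. A minimal sketch, assuming per-token probabilities are given as floats (the function name and signature are illustrative, not from the original):

```python
import random

def accept_or_reject(p_target: float, q_draft: float) -> bool:
    """Acceptance check for one proposed token in speculative decoding.

    p_target: probability the target model assigns to the proposed token.
    q_draft:  probability the draft model assigns to the same token.
    """
    if p_target >= q_draft:
        # Case (A): the target model rates the token at least as highly
        # as the draft model did, so it is accepted outright.
        return True
    # Case (B): probabilistic check. Accept with probability
    # p_target / q_draft; reject with probability 1 - p_target / q_draft.
    return random.random() < p_target / q_draft
```

Note that rejection is never automatic when the draft probability exceeds the target probability; the token still survives the check with probability p_target / q_draft.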
Tags
Ch.5 Inference - Foundations of Large Language Models
Computing Sciences
Application in Bloom's Taxonomy
Related
Determining the Maximum Number of Consecutively Accepted Tokens in Speculative Decoding
Role of the Uniformly Distributed Random Variable in Speculative Decoding
In a text generation process, a small, fast model proposes the next token as 'learning' with a probability of 0.8. A larger, more accurate model then evaluates this same token and assigns it a probability of 0.6. Based on the standard acceptance-rejection procedure used in this context, what is the outcome for the token 'learning'?
Evaluating Proposed Tokens in a Generation Process
In a text generation process that uses a draft model and a target model, if the draft model assigns a higher probability to a proposed token than the target model does, that token is automatically rejected.
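For concreteness, the numbers in the related question above (draft probability 0.8, target probability 0.6 for the token 'learning') work out as follows. A sketch of the standard calculation, with variable names chosen for illustration:

```python
# Numbers from the related question: the draft model proposes 'learning'
# with probability 0.8; the target model assigns it probability 0.6.
q_draft = 0.8
p_target = 0.6

# Since p_target < q_draft, the token is neither accepted outright nor
# automatically rejected; it is subject to a probabilistic check.
accept_prob = p_target / q_draft   # 0.6 / 0.8 = 0.75
reject_prob = 1.0 - accept_prob    # 0.25
```

So 'learning' is kept with probability 0.75 and rejected with probability 0.25, which also shows why the true/false statement above is false: a higher draft probability does not trigger automatic rejection.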