In a text generation process that uses a draft model and a target model, a proposed token to which the draft model assigns a higher probability q(x) than the target model's p(x) is not automatically rejected. Under the standard acceptance-rejection procedure of speculative decoding, such a token is accepted with probability p(x)/q(x); only if that stochastic test fails is it rejected, in which case a replacement is sampled from the normalized residual distribution proportional to max(0, p(x) − q(x)).
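A minimal Python sketch of this acceptance test for a single proposed token (the function name `accept_token` and the scalar-probability interface are illustrative, not taken from any particular library):

```python
import random

def accept_token(p_target: float, q_draft: float) -> bool:
    """Speculative-decoding acceptance test for one proposed token.

    Accept outright when the target model is at least as confident as
    the draft model; otherwise accept with probability p_target / q_draft.
    """
    if p_target >= q_draft:
        return True
    return random.random() < p_target / q_draft

# E.g. with q(x) = 0.8 from the draft and p(x) = 0.6 from the target,
# the token is kept roughly 75% of the time, not rejected automatically.
```

On rejection, the full algorithm resamples a replacement token from the residual distribution norm(max(0, p − q)), which is what makes the overall output distribution match the target model exactly.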
Ch.5 Inference - Foundations of Large Language Models
In a text generation process, a small, fast model proposes the next token as 'learning' with a probability of 0.8. A larger, more accurate model then evaluates this same token and assigns it a probability of 0.6. Based on the standard acceptance-rejection procedure used in this context, what is the outcome for the token 'learning'?
Evaluating Proposed Tokens in a Generation Process
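Under the standard acceptance-rejection procedure the question refers to (accept with probability min(1, p/q)), the given numbers can be plugged in directly; a worked sketch with illustrative variable names:

```python
# Probabilities from the question (variable names are mine).
q_draft = 0.8    # small, fast model's probability for 'learning'
p_target = 0.6   # larger, more accurate model's probability for 'learning'

# Since p_target < q_draft, 'learning' is not accepted outright.
# It is kept with probability p_target / q_draft, i.e. about a 75%
# chance of acceptance and a 25% chance of rejection followed by
# resampling from the residual distribution.
accept_prob = min(1.0, p_target / q_draft)
print(f"{accept_prob:.2f}")  # prints 0.75
```

So the outcome is probabilistic, not deterministic: the token survives with probability 0.75.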