Role of the Uniformly Distributed Random Variable (r) in Speculative Decoding
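In the acceptance-rejection step of speculative decoding, r is a random variable sampled from a uniform distribution between 0 and 1, denoted as r ~ U(0, 1). For each candidate token at position i, a new value of r is drawn and compared against the ratio of the target model's probability to the draft model's probability for that token. The token is accepted if r falls below this ratio and rejected otherwise, so a token that the target model rates at least as likely as the draft model does is always accepted.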
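The acceptance check described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `accept_token` and `verify_draft`, and the argument names `p_target` and `q_draft` (the probabilities that the target and draft models assign to the drafted token), are assumptions introduced here for the example.

```python
import random


def accept_token(p_target: float, q_draft: float, r: float) -> bool:
    """Accept the drafted token when r < min(1, p_target / q_draft).

    r is assumed to be a draw from the uniform distribution on [0, 1).
    """
    if q_draft <= 0.0:
        raise ValueError("the draft model must assign the token a nonzero probability")
    return r < min(1.0, p_target / q_draft)


def verify_draft(p_target, q_draft, rng):
    """Count how many drafted tokens are accepted before the first rejection.

    p_target[i] and q_draft[i] are the two models' probabilities for the
    token drafted at position i (hypothetical inputs for this sketch).
    """
    accepted = 0
    for p, q in zip(p_target, q_draft):
        r = rng.random()  # a fresh uniform draw for every position
        if not accept_token(p, q, r):
            break
        accepted += 1
    return accepted
```

For instance, with a draft probability of 0.8 and a target probability of 0.6, the ratio is 0.75, so `accept_token(0.6, 0.8, 0.7)` returns `True` while a draw of 0.8 would lead to rejection. Drawing a new r at each position keeps the decisions for different tokens independent of one another.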
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Determining the Maximum Number of Consecutively Accepted Tokens in Speculative Decoding
In a text generation process, a small, fast model proposes the next token as 'learning' with a probability of 0.8. A larger, more accurate model then evaluates this same token and assigns it a probability of 0.6. Based on the standard acceptance-rejection procedure used in this context, what is the outcome for the token 'learning'?
Evaluating Proposed Tokens in a Generation Process
In a text generation process that uses a draft model and a target model, if the draft model assigns a higher probability to a proposed token than the target model does, that token is automatically rejected.
Learn After
Formula for the Number of Consecutively Accepted Tokens in Speculative Decoding
In a system that uses a faster, smaller model to generate candidate tokens for a larger, more accurate model, a single token is being evaluated. The faster model assigns a probability of 0.8 to this token, while the more accurate model assigns it a probability of 0.6. For the acceptance check, a random number of 0.7 is drawn from a uniform distribution between 0 and 1. Based on this information, what is the outcome for this candidate token?
Speculative Decoding Acceptance Analysis
The Role of Randomness in Token Acceptance