1Cademy - Example of Draft Token Generation in Speculative Decoding

To illustrate draft generation, the first phase of speculative decoding, consider a draft model that proposes $\tau = 5$ candidate tokens. Given a context $(\mathbf{x}, \mathbf{y}_{<i})$, the draft model uses its probability distribution $\mathrm{Pr}_q(\cdot)$ to generate five tokens autoregressively: $\hat{y}_{i+1}, \hat{y}_{i+2}, \hat{y}_{i+3}, \hat{y}_{i+4}, \hat{y}_{i+5}$. Each draft token is conditioned on the initial context and on all draft tokens generated before it in the current step, so $\hat{y}_{i+3}$, for example, depends on $(\mathbf{x}, \mathbf{y}_{<i}, \hat{y}_{i+1}, \hat{y}_{i+2})$.
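The autoregressive loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `toy_draft_probs` is a hypothetical stand-in for $\mathrm{Pr}_q(\cdot)$ over a ten-token vocabulary, the function names are invented for this example, and greedy (argmax) selection is assumed where a real draft model might sample instead.

```python
from typing import Callable, List, Sequence

VOCAB_SIZE = 10  # assumption: toy vocabulary of token ids 0..9


def toy_draft_probs(tokens: Sequence[int]) -> List[float]:
    """Hypothetical stand-in for Pr_q(. | tokens).

    Puts probability 0.9 on (last token + 1) mod VOCAB_SIZE and spreads
    the remaining 0.1 evenly over the other tokens.
    """
    favored = (tokens[-1] + 1) % VOCAB_SIZE
    rest = 0.1 / (VOCAB_SIZE - 1)
    return [0.9 if t == favored else rest for t in range(VOCAB_SIZE)]


def generate_draft(
    context: Sequence[int],
    probs_fn: Callable[[Sequence[int]], List[float]],
    tau: int = 5,
) -> List[int]:
    """Generate tau draft tokens autoregressively with greedy selection."""
    tokens = list(context)  # copy so the caller's context is not modified
    draft: List[int] = []
    for _ in range(tau):
        # Condition on the context plus every draft token produced so far.
        probs = probs_fn(tokens)
        next_token = max(range(len(probs)), key=probs.__getitem__)
        draft.append(next_token)
        tokens.append(next_token)
    return draft


context = [3, 7]  # assumption: token ids standing in for (x, y_{<i})
draft_tokens = generate_draft(context, toy_draft_probs, tau=5)
print(draft_tokens)  # [8, 9, 0, 1, 2]
```

Each pass through the loop appends the newly chosen token to `tokens` before the next call to `probs_fn`, which is what makes every draft token depend on all earlier draft tokens from the same step.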

Updated 2025-10-09

References

Reference of Foundations of Large Language Models Course
