Definition

Set of Tokens Generated in a Single Speculative Decoding Step

In a single step of speculative decoding, the set of newly generated tokens that extends the existing sequence is composed of the consecutively accepted draft tokens and one final token from the verification model. This set is formally represented as: {y^i+1,...,y^i+na,yˉi+na+1}\lbrace\hat{y}_{i+1}, ..., \hat{y}_{i+n_a}, \bar{y}_{i+n_a+1}\rbrace where {y^i+1,...,y^i+na}\{\hat{y}_{i+1}, ..., \hat{y}_{i+n_a}\} are the nan_a accepted draft tokens and yˉi+na+1\bar{y}_{i+n_a+1} is the token generated by the verification model. A more general, simplified notation for this set is {y^,...,y^,yˉ}\{\hat{y}, ..., \hat{y}, \bar{y}\}, highlighting the composition of accepted draft tokens and a single verification model token.

Image 0

0

1

Updated 2025-10-07

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences