Definition

Set of Accepted Speculative Tokens

The notation \{\hat{y}_{i+1}, \ldots, \hat{y}_{i+n_a}\} denotes the tokens proposed by the draft model that the target model accepts consecutively in speculative decoding. The sequence starts at index i+1 and contains n_a tokens, where n_a is the number of draft tokens accepted before the first rejection. Draft tokens at or after the first rejection are not part of the set, and if no draft token is rejected, n_a equals the number of drafted tokens.
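
The definition can be illustrated with a minimal Python sketch. It assumes the target model's per-token verdicts are already available as a list of booleans; how those verdicts are produced (the acceptance rule) is not specified here. The function name `accepted_prefix` and its arguments are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def accepted_prefix(draft_tokens: Sequence[int], accepted: Sequence[bool]) -> List[int]:
    """Return the draft tokens accepted consecutively before the first rejection.

    draft_tokens: tokens proposed by the draft model for positions i+1, i+2, ...
    accepted:     hypothetical per-token verdicts from the target model
                  (True = accepted, False = rejected), same length as draft_tokens.
    """
    if len(draft_tokens) != len(accepted):
        raise ValueError("draft_tokens and accepted must have the same length")

    # n_a counts accepted tokens up to, but not including, the first rejection.
    n_a = 0
    for ok in accepted:
        if not ok:
            break
        n_a += 1

    # These are y_hat_{i+1}, ..., y_hat_{i+n_a}.
    return list(draft_tokens[:n_a])


draft = [11, 42, 7, 93]
verdicts = [True, True, False, True]
kept = accepted_prefix(draft, verdicts)
print(len(kept), kept)
```

In this example the third draft token is rejected, so n_a = 2 and only the first two tokens are kept. The fourth token is discarded even though its verdict is True, because it comes after the first rejection.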

Updated 2025-10-07

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences