Activity (Process)

Post-Acceptance Token Generation in Speculative Decoding

Once the number of consecutively accepted draft tokens, $n_a$, is known, these tokens are appended to the final output. The evaluation model then predicts the next token, at position $i + n_a + 1$, and decoding continues autoregressively from this new point.
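
The step above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the list-of-integers token representation, and the toy `toy_eval_next_token` stand-in for the evaluation model are all assumptions made for the example. The prefix is taken to hold the first $i$ tokens, so the newly generated token lands at position $i + n_a + 1$.

```python
from typing import Callable, List, Sequence


def extend_after_acceptance(
    prefix: Sequence[int],
    draft_tokens: Sequence[int],
    n_a: int,
    eval_next_token: Callable[[List[int]], int],
) -> List[int]:
    """Append the n_a accepted draft tokens, then one token from the evaluation model.

    All names here are hypothetical; `eval_next_token` stands in for a call
    to the evaluation model that returns the next token for a given context.
    """
    if not 0 <= n_a <= len(draft_tokens):
        raise ValueError("n_a must be between 0 and the number of draft tokens")

    # Keep only the consecutively accepted draft tokens (positions i+1 .. i+n_a).
    accepted = list(draft_tokens[:n_a])
    context = list(prefix) + accepted

    # The evaluation model generates the token at position i + n_a + 1.
    next_token = eval_next_token(context)
    return context + [next_token]


def toy_eval_next_token(context: List[int]) -> int:
    # Toy stand-in for the evaluation model (an assumption for this sketch):
    # a deterministic rule over a vocabulary of 10 token ids.
    return sum(context) % 10


prefix = [3, 1, 4]   # i = 3 tokens already in the output
draft = [1, 5, 9]    # tokens proposed by the draft model
n_a = 2              # the first two draft tokens were accepted

output = extend_after_acceptance(prefix, draft, n_a, toy_eval_next_token)
print(output)        # the last element sits at position i + n_a + 1 = 6
```

The next round of drafting would then start from `output`, whose length is always $i + n_a + 1$, even when no draft token is accepted ($n_a = 0$).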

Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
