Activity (Process)

Iterative Process of Speculative Decoding

Speculative decoding operates as a cyclical process. After a set of new tokens—comprising the accepted draft tokens and one final token from the verification model—is generated in a single step, this entire set is appended to the existing context. This updated, longer context is then used as the basis for the next iteration of the process.

0

1

Updated 2026-05-05

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences