Mathematical Formulation of Verification Model Evaluation in Speculative Decoding

In speculative decoding, the verification model evaluates the entire sequence of $\tau$ draft tokens,

$$\{\hat{y}_{i+1}, \ldots, \hat{y}_{i+\tau}\},$$

in a single, parallel step. This is achieved by computing the conditional probability of each draft token under the verification model's distribution, $\Pr_p$.

The probability for each token $\hat{y}_{i+t}$ is conditioned on the original prefix $[\mathbf{x}, \mathbf{y}_{\le i}]$ and all preceding draft tokens $\hat{y}_{i+1}, \ldots, \hat{y}_{i+t-1}$. The set of probabilities computed is:

$$\Big\{ \Pr_p(\hat{y}_{i+1} \mid \mathbf{x}, \mathbf{y}_{\le i}), \; \ldots, \; \Pr_p(\hat{y}_{i+\tau} \mid \mathbf{x}, \mathbf{y}_{\le i}, \hat{y}_{i+1}, \ldots, \hat{y}_{i+\tau-1}) \Big\}.$$
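The set above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_verifier_logits` is a hypothetical stand-in for the verification model (a deterministic toy function, not a real language model), `VOCAB_SIZE` is arbitrary, and tokens are plain integers. The only property the sketch relies on is causality: the output at a given position depends only on the tokens up to and including that position. Under that assumption, one pass over the prefix followed by the draft yields all $\tau$ conditional probabilities at once.

```python
import math

VOCAB_SIZE = 5  # hypothetical toy vocabulary size


def toy_verifier_logits(tokens):
    """Hypothetical stand-in for one forward pass of a causal verification model.

    Returns one row of logits per input position. Row j depends only on
    tokens[: j + 1], which mimics causal attention masking.
    """
    rows = []
    state = 0
    for tok in tokens:
        state = (state * 3 + tok + 1) % 7  # toy causal state update
        rows.append([math.sin(state + v) for v in range(VOCAB_SIZE)])
    return rows


def softmax(logits):
    """Convert a row of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def verify_draft(prefix, draft):
    """Return [Pr_p(draft[t] | prefix, draft[:t]) for each t] from a single pass."""
    if not prefix:
        raise ValueError("this sketch assumes a non-empty prefix")
    # One pass over prefix + draft produces logits for every position.
    logits = toy_verifier_logits(prefix + draft)
    probs = []
    for t, tok in enumerate(draft):
        # The row at the position just before draft[t] predicts draft[t].
        row = logits[len(prefix) + t - 1]
        probs.append(softmax(row)[tok])
    return probs
```

Calling `verify_draft([1, 3, 0], [2, 4, 1, 0])` returns four probabilities, one per draft token, matching the $\tau = 4$ entries of the set. Because the toy model is causal, each entry equals what a separate, sequential call on `prefix + draft[:t]` would give, which is why the draft can be scored in one step rather than $\tau$ steps.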

Updated 2026-05-05

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
