
Learn Before

Conditional Probability Distribution of the Draft Model in Speculative Decoding

Multiple Choice

Imagine a text generation system where a small, fast model first generates a short sequence of candidate tokens (e.g., C1, C2, C3). Then, a large, accurate model checks all these candidates at once. Let's say the system has already produced a confirmed sequence of tokens: ['The', 'cat', 'sat']. The small model has just generated two candidate tokens in the current step: ['on', 'the']. What information does the small model use to calculate the probability distribution for the next candidate
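As a minimal sketch of the conditioning described in this scenario, assume the standard autoregressive drafting setup in speculative decoding: the small model conditions each new candidate on the confirmed tokens plus the candidates it has already drafted in the current step. The names `draft_context` and `toy_draft_model`, and the probabilities in the lookup table, are hypothetical and exist only for illustration.

```python
from typing import Dict, List


def draft_context(confirmed: List[str], drafted: List[str]) -> List[str]:
    """Build the context the draft model conditions on for its next candidate.

    Assumption: standard autoregressive drafting, so the context is the
    confirmed prefix followed by the candidates drafted so far in this step.
    """
    return list(confirmed) + list(drafted)


def toy_draft_model(context: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for the small model: returns P(next | context).

    A real draft model would use the whole context; this toy version looks
    only at the last token so the example stays tiny. The numbers are made up.
    """
    table = {"the": {"mat": 0.7, "rug": 0.3}}
    last = context[-1] if context else ""
    return table.get(last, {"<unk>": 1.0})


confirmed = ["The", "cat", "sat"]  # tokens already accepted by the large model
drafted = ["on", "the"]            # candidates proposed in the current step

context = draft_context(confirmed, drafted)
next_dist = toy_draft_model(context)
```

Note that the large model's output for the current candidates does not appear in `context`: in this setup, verification happens only after the small model has finished drafting.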

Updated 2025-10-02
