1Cademy - Decoding Phase Goal Formula

Learn Before

Decoding Phase in Transformer Inference

Formula

Decoding Phase Goal Formula

In large language models, the objective of the decoding phase is to find the most probable output sequence of tokens. Rather than conditioning the prediction directly on the original input sequence, generation conditions on the contextual representation built during the preceding prefilling stage. The optimal predicted sequence, denoted $\hat{\mathbf{y}}$, is the one that maximizes the conditional probability given this context: $\hat{\mathbf{y}} = \operatorname*{arg\,max}_{\mathbf{y}} \Pr(\mathbf{y} \mid \mathrm{cache})$, where $\mathrm{cache}$ refers to the accumulated Key-Value (KV) cache.
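
To make the arg max concrete, here is a minimal Python sketch under loud assumptions. It is not the course's implementation. The table `TOY_MODEL` and the helpers `prefill`, `next_token_probs`, `sequence_log_prob`, and `best_sequence` are all hypothetical. The "cache" is a plain list of tokens standing in for the key and value tensors a real KV cache would hold. The sketch also assumes the usual autoregressive factorization, in which the probability of a sequence is the product of per-token probabilities, each conditioned on the cache plus the tokens generated so far.

```python
import itertools
import math

VOCAB = ("a", "b", "c")

# Hypothetical toy model: Pr(next token | last token in the cache).
# A real model would attend over every cached key/value pair instead.
TOY_MODEL = {
    "a": {"a": 0.2, "b": 0.5, "c": 0.3},
    "b": {"a": 0.1, "b": 0.2, "c": 0.7},
    "c": {"a": 0.6, "b": 0.3, "c": 0.1},
}


def prefill(prompt_tokens):
    """Stand-in for the prefilling stage: build the cache from the prompt."""
    return list(prompt_tokens)


def next_token_probs(cache):
    """Next-token distribution given the current cache (toy lookup)."""
    return TOY_MODEL[cache[-1]]


def sequence_log_prob(y, cache):
    """log Pr(y | cache), accumulated token by token via the chain rule."""
    cache = list(cache)  # work on a copy so the caller's cache is untouched
    total = 0.0
    for token in y:
        total += math.log(next_token_probs(cache)[token])
        cache.append(token)  # the cache grows as each token is decoded
    return total


def best_sequence(cache, length):
    """Exhaustive arg max over all sequences of a fixed length."""
    candidates = itertools.product(VOCAB, repeat=length)
    return max(candidates, key=lambda y: sequence_log_prob(y, cache))


cache = prefill(["c", "a"])
y_hat = best_sequence(cache, 2)
print(y_hat)  # ('b', 'c'), with probability 0.5 * 0.7 = 0.35
```

Exhaustive search is only feasible here because the vocabulary and sequence length are tiny. The number of candidates grows exponentially with length, so practical decoders approximate the arg max with strategies such as greedy or beam search.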

Updated 2026-05-03

References

Reference of Foundations of Large Language Models Course
