Mathematical Formulation of LLM Inference
The inference process for Large Language Models is mathematically defined as identifying the most probable output sequence given an input context x = (x_1, ..., x_m). This involves determining the sequence that maximizes the conditional log-probability: ŷ = argmax_y log Pr(y | x). To account for the step-by-step nature of text generation, this equation calculates the sum of the log-probabilities for predicting each individual token starting from position m + 1, rather than position 1. Each token's probability is conditioned on the initial context sequence (x_1, ..., x_m) and all prior generated tokens (x_{m+1}, ..., x_{i-1}): log Pr(y | x) = Σ_{i=m+1}^{m+n} log Pr(x_i | x_1, ..., x_{i-1}), where y = (x_{m+1}, ..., x_{m+n}).
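As a minimal sketch of this objective, the summation can be implemented by scoring each candidate continuation token by token. The interface `next_token_logprob` is a hypothetical stand-in for a real model's conditional next-token distribution, not part of any particular library:

```python
def sequence_logprob(context, candidate, next_token_logprob):
    """Sum log Pr(x_i | x_1, ..., x_{i-1}) over each generated token,
    starting at position m + 1 (the first token after the context)."""
    prefix = list(context)
    total = 0.0
    for token in candidate:
        # Condition on the context plus all previously generated tokens
        total += next_token_logprob(tuple(prefix), token)
        prefix.append(token)
    return total

def select_output(context, candidates, next_token_logprob):
    """Return the candidate y that maximizes log Pr(y | x)."""
    return max(candidates,
               key=lambda y: sequence_logprob(context, y, next_token_logprob))
```

In practice the space of all possible sequences is far too large to enumerate, so decoders approximate this argmax with strategies such as greedy or beam search rather than scoring every candidate.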

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Mathematical Formulation of LLM Inference
Single-Round Prediction Problem
Token-Level Representation of Input and Output Sequences for a Forward Pass
Multi-Round Prediction Problem
Notation for Concatenated Token Sequences
A language model is given an input sequence of tokens representing the phrase 'The best way to learn a new skill is'. The model then calculates the likelihood for several possible completing sequences. Based on the formal objective of the text generation process, which of the following sequences should the model select to output?
Analyzing Model Output Selection
A language model is given an input context x. It then evaluates two potential output sequences, y_1 and y_2. The model's internal calculations determine that y_1 has a higher probability of occurring after x than y_2. However, a human evaluator finds y_2 to be more creative and detailed. According to the formal objective of the text generation process, what should the model do?
Equivalence of Maximizing Auto-regressive Log-Likelihood and Minimizing Cross-Entropy Loss
Conditional vs. Joint Probability Objectives in Language Modeling
Notational Convention for Autoregressive Conditional Probability
Modeling and Efficient Computation of Conditional Token Probabilities
A language model is generating a response sequence 'y' given an input context 'x'. The model generates the two-token sequence y = ('deep', 'learning'). The model's calculated log-probabilities for each step of the generation are as follows:
- Log-probability of the first token: log Pr(y₁='deep' | x) = -0.7
- Log-probability of the second token, given the first: log Pr(y₂='learning' | x, y₁='deep') = -0.4
Based on the standard method for calculating the probability of a full sequence, what is the total conditional log-likelihood of the entire sequence 'y', i.e., log Pr(y|x)?
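Applying the chain-rule decomposition, the sequence log-likelihood is simply the sum of the per-step conditional log-probabilities; a minimal check in Python:

```python
# Per-step conditional log-probabilities from the example above
logp_step1 = -0.7  # log Pr(y1='deep' | x)
logp_step2 = -0.4  # log Pr(y2='learning' | x, y1='deep')

# log Pr(y|x) = log Pr(y1|x) + log Pr(y2|x, y1)
logp_sequence = logp_step1 + logp_step2
print(round(logp_sequence, 10))  # -1.1
```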
Comparing Model Confidence via Log-Likelihood
Analyzing a Flawed Log-Likelihood Calculation
Learn After
Conditional Probability in Sequence-to-Sequence Generation
Next-Token Probability Calculation in Autoregressive Decoders
Example of Autoregressive Generation and Log-Probability Calculation
An auto-regressive language model is generating text following the input 'The cat sat on the'. The model's objective is to find the output sequence with the highest total log-probability. It is considering two possible two-word continuations:
Path A: 'warm mat'
- log Pr('warm' | 'The cat sat on the') = -0.9
- log Pr('mat' | 'The cat sat on the warm') = -1.5
Path B: 'plush rug'
- log Pr('plush' | 'The cat sat on the') = -1.2
- log Pr('rug' | 'The cat sat on the plush') = -1.1
Based on the provided conditional log-probabilities, which path will the model choose and why?
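Working the comparison through: Path A totals -0.9 + (-1.5) = -2.4, while Path B totals -1.2 + (-1.1) = -2.3, so Path B wins on total log-probability even though its first token is individually less likely. A quick sketch:

```python
# Per-step conditional log-probabilities for each candidate path
path_a = [-0.9, -1.5]  # 'warm', then 'mat'
path_b = [-1.2, -1.1]  # 'plush', then 'rug'

score_a = sum(path_a)  # -2.4 (up to float rounding)
score_b = sum(path_b)  # -2.3 (up to float rounding)

# The objective selects the path with the higher total log-probability
best = "Path B" if score_b > score_a else "Path A"
print(best)  # Path B
```

Note that a greedy decoder would have committed to 'warm' at the first step (-0.9 > -1.2) and missed the higher-scoring sequence, which is exactly the greedy-vs-optimal distinction raised below.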
Debugging a Generation Model's Choice
Greedy Decoding vs. Optimal Sequence Probability
Reconciling Training Log-Likelihood with Inference-Time Sequence Selection
Diagnosing a “High-Confidence Wrong Token” Bug in Autoregressive Scoring
Explaining a Counterintuitive Decoding Outcome Using Softmax, Next-Token Conditionals, and Sequence Log-Probability
Auditing a Candidate Completion Using Softmax Next-Token Probabilities and Autoregressive Log-Probability
Investigating a Production Scoring Bug: Softmax Normalization vs. Autoregressive Sequence Log-Probability
Root-Cause Analysis: Why a “More Likely” Token-by-Token Completion Loses on Total Sequence Score
Design a Correct Sequence-Scoring Function for Autoregressive LLM Outputs
Your team is building an internal tool that ranks ...
You’re reviewing an internal evaluation script tha...
You’re reviewing an internal LLM evaluation pipeli...
Direct Computation of Output Sequence Log-Probability in LLMs