Formula

Mathematical Formulation of LLM Inference

The inference process for Large Language Models is mathematically defined as identifying the most probable output sequence $\mathbf{y}$ given an input context $\mathbf{x}$. This means determining the sequence $\hat{\mathbf{y}}$ that maximizes the conditional log-probability: $\hat{\mathbf{y}} = \argmax_{\mathbf{y}} \log \Pr(\mathbf{y} \mid \mathbf{x})$. To reflect the step-by-step nature of text generation, this objective decomposes into a sum of log-probabilities, one for each generated token $y_i$, with generation starting at position $m+1$ (immediately after the context) rather than at position $0$. Each token's probability is conditioned on the context sequence $x_0, \dots, x_m$ and all previously generated tokens $y_1, \dots, y_{i-1}$: $\hat{\mathbf{y}} = \argmax_{\mathbf{y}} \sum_{i=1}^{n} \log \Pr(y_i \mid x_0, \dots, x_m, y_1, \dots, y_{i-1})$.
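The decomposition above can be sketched in code. The following is a minimal illustration, not an actual LLM: `toy_next_token_probs` is a hypothetical stand-in for a model's softmax output over a 3-token vocabulary, and greedy decoding is used as a common (approximate) way of searching for the argmax sequence.

```python
import math

def toy_next_token_probs(context):
    # Hypothetical stand-in for an LLM's next-token distribution.
    # Toy rule: strongly favor the token after the last one, mod 3.
    favored = (context[-1] + 1) % 3
    probs = [0.1, 0.1, 0.1]
    probs[favored] = 0.8
    return probs

def sequence_log_prob(x, y):
    """Sum of log Pr(y_i | x_0..x_m, y_1..y_{i-1}) over generated tokens."""
    total = 0.0
    context = list(x)          # start from the context x_0..x_m
    for token in y:
        probs = toy_next_token_probs(context)
        total += math.log(probs[token])
        context.append(token)  # condition the next step on this token
    return total

def greedy_decode(x, n):
    """Greedy approximation of argmax_y: pick the most probable token each step."""
    context = list(x)
    y = []
    for _ in range(n):
        probs = toy_next_token_probs(context)
        token = max(range(len(probs)), key=lambda t: probs[t])
        y.append(token)
        context.append(token)
    return y
```

Note that greedy decoding maximizes each term of the sum locally, which does not always yield the globally most probable sequence; beam search or sampling are standard alternatives.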

Updated 2026-04-19

Ch.2 Generative Models - Foundations of Large Language Models