Notational Convention for Autoregressive Conditional Probability
In autoregressive models, the notation for the conditional probability of a token, Pr(y_i|x, y_{<i}), is a common shorthand. It signifies the probability of token y_i conditioned on the single sequence formed by concatenating the input x with the preceding output tokens y_{<i}. A more explicit, but less frequently used, notation for this is Pr(y_i|[x, y_{<i}]), where [x, y_{<i}] represents the full context for the prediction.
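The equivalence of the two notations can be sketched in code: a model's next-token distribution is a function of one context sequence, so Pr(y_i|x, y_{<i}) and Pr(y_i|[x, y_{<i}]) denote the same quantity. The toy uniform model below is an assumption for illustration, not a real language model.

```python
# Minimal sketch: the two-argument shorthand Pr(y_i | x, y_{<i}) resolves
# to a single-sequence conditional Pr(y_i | [x, y_{<i}]).
# The toy model (uniform over a tiny vocabulary, ignoring context) is an
# assumption purely for illustration.

VOCAB = ["deep", "learning", "about", "a"]

def pr_next(token, context):
    """Toy Pr(token | context): uniform over VOCAB, ignoring context."""
    return 1.0 / len(VOCAB) if token in VOCAB else 0.0

def pr_shorthand(token, x, y_prefix):
    """The common shorthand Pr(y_i | x, y_{<i}) with two context arguments."""
    # Internally there is only ONE context: the concatenation [x, y_{<i}].
    return pr_next(token, list(x) + list(y_prefix))

# Both notations give the same value for the same prediction:
x, y_prefix = ["hello"], ["deep"]
assert pr_shorthand("learning", x, y_prefix) == pr_next("learning", x + y_prefix)
print("notations agree")
```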
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Mathematical Formulation of LLM Inference
Equivalence of Maximizing Auto-regressive Log-Likelihood and Minimizing Cross-Entropy Loss
Conditional vs. Joint Probability Objectives in Language Modeling
Notational Convention for Autoregressive Conditional Probability
Modeling and Efficient Computation of Conditional Token Probabilities
A language model is generating a response sequence 'y' given an input context 'x'. The model generates the two-token sequence y = ('deep', 'learning'). The model's calculated log-probabilities for each step of the generation are as follows:
- Log-probability of the first token: log Pr(y₁='deep' | x) = -0.7
- Log-probability of the second token, given the first: log Pr(y₂='learning' | x, y₁='deep') = -0.4

Based on the standard method for calculating the probability of a full sequence, what is the total conditional log-likelihood of the entire sequence 'y', i.e., log Pr(y|x)?
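A quick sanity check of the arithmetic: by the chain rule, log Pr(y|x) is the sum of the per-step conditional log-probabilities. The values below are the ones given in the question.

```python
import math

# Chain rule for autoregressive models:
#   log Pr(y|x) = log Pr(y1|x) + log Pr(y2|x, y1)
step_log_probs = [
    -0.7,  # log Pr(y1='deep' | x)
    -0.4,  # log Pr(y2='learning' | x, y1='deep')
]

log_likelihood = sum(step_log_probs)
print(f"{log_likelihood:.1f}")  # -1.1

# Equivalently, the sequence probability is the product of step probabilities:
assert math.isclose(math.exp(log_likelihood),
                    math.exp(-0.7) * math.exp(-0.4))
```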
Comparing Model Confidence via Log-Likelihood
Analyzing a Flawed Log-Likelihood Calculation
Learn After
Probability Normalization over a Candidate Set
An autoregressive model is given an input prompt, x, which is the sequence 'The best movie I ever saw was'. The model has already generated the partial output sequence, y_{<i}, which is 'about a'. The model's next task is to predict the probability of the next token, y_i, based on the standard conditional probability notation Pr(y_i|x, y_{<i}). What is the actual, full sequence of tokens the model uses as its context to make this prediction?

In the context of autoregressive sequence generation, the notation Pr(y_i|x, y_{<i}) implies that the model treats the input x and the previously generated tokens y_{<i} as two separate, distinct sources of information for predicting the next token y_i.
Interpreting Autoregressive Model Inputs
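For the movie-prompt question above, the context the model actually conditions on can be built explicitly; splitting on whitespace is an assumed stand-in for real tokenization.

```python
# Sketch: the model's context is the single concatenation of the prompt x
# and the partial output y_{<i}. Word-level splitting is an assumption
# standing in for a real tokenizer.

x = "The best movie I ever saw was".split()
y_prefix = "about a".split()

full_context = x + y_prefix  # [x, y_{<i}]: the one sequence conditioned on
print(" ".join(full_context))  # The best movie I ever saw was about a
```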