Mathematical Notation for Text Generation Probability
In language modeling, the probability of generating a specific text sequence, denoted as y, given a preceding context, denoted as x, is mathematically represented as Pr_θ(y|x), where θ denotes the model's learned parameters. This conditional probability notation is fundamental for formalizing text generation tasks, including those that involve adapting models to process very long token sequences.
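To make the notation concrete, here is a minimal sketch of scoring Pr_θ(y|x) with an off-the-shelf causal language model by summing per-token log-probabilities via the chain rule. It assumes the Hugging Face transformers library and the public gpt2 checkpoint; the helper name sequence_log_prob is an illustrative choice, not something defined in this note.

```python
# Minimal sketch: estimate log Pr_theta(y | x) with a causal language model.
# Assumes the Hugging Face "transformers" library and the "gpt2" checkpoint;
# both are illustrative choices, not prescribed by the note.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_log_prob(context: str, continuation: str) -> float:
    """Return log Pr_theta(y | x): the log-probability of `continuation` (y)
    given `context` (x), summed over the continuation's tokens."""
    x_ids = tokenizer(context, return_tensors="pt").input_ids
    y_ids = tokenizer(continuation, return_tensors="pt").input_ids
    input_ids = torch.cat([x_ids, y_ids], dim=1)

    with torch.no_grad():
        logits = model(input_ids).logits  # (1, seq_len, vocab_size)

    # The logit at position t predicts the token at position t + 1, so shift
    # by one and keep only the positions that predict y's tokens.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    y_positions = log_probs[:, x_ids.size(1) - 1:, :]
    token_log_probs = y_positions.gather(-1, y_ids.unsqueeze(-1)).squeeze(-1)
    return token_log_probs.sum().item()

print(sequence_log_prob("The sky is", " blue."))
```

Summing log-probabilities rather than multiplying raw probabilities avoids numerical underflow for long sequences; exponentiating the sum recovers Pr_θ(y|x) itself.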
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Related
Fundamental LLM Training Objective
Diverse and Combined Data Sources for LLM Pre-training
Traditional View on Diminishing Returns from Scaling
Text Generation Probability
Two Primary Approaches to Scaling LLMs
Scaling Laws as a Fundamental Principle in LLM Development
Decoding as a Search Process in LLMs
The Virtuous Cycle of Scaling in Language Models
Computational Infeasibility of Standard Transformers for Long Sequences
LLM Scaling Strategy for a New Application
Comparison of Traditional vs. Modern Views on LLM Scaling
Modern View on Continued Performance Gains from Scaling
Mathematical Notation for Text Generation Probability
A research team is developing a large language model designed to analyze and summarize entire novels in a single pass. Based on the core principles of scaling these models, what is the primary architectural challenge they must overcome?
A development team is building a large-scale language model and has a fixed budget for the computational resources required for training. They observe that their current model, which has a moderately complex architecture, stops improving its performance even when they continue training it on their existing large dataset. To achieve a significant leap in the model's capabilities, which of the following approaches represents the most effective use of their limited computational budget?
A leading AI research lab is deciding between two major projects for their next-generation language model.
- Project Alpha: Aims to train a model on a dataset ten times larger than any previously used, using a well-established architecture that has known limitations with very long text inputs.
- Project Beta: Aims to develop a novel model architecture capable of processing entire books as a single input, but due to the experimental nature and computational cost of this new design, it will be trained on a standard-sized, existing dataset.
Which project represents a more direct application of the most widely accepted and foundational principle for advancing the general capabilities of large language models, and why?
Fundamental LLM Training Objective
LLM Policy as a Probability Distribution
A language model is given the context: 'The chef carefully added the final, crucial ingredient to the simmering stew: a pinch of...'. The model must predict the next word. Below are the conditional probabilities, Pr(next_word | context), calculated by two different models for four possible next words.

| Next Word | Model A Probability | Model B Probability |
| --- | --- | --- |
| salt | 0.65 | 0.20 |
| concrete | 0.02 | 0.45 |
| laughter | 0.03 | 0.15 |
| thyme | 0.30 | 0.20 |

Based on this data, which of the following statements is the most accurate analysis of the models' understanding of the context?
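For reference, per-word conditional probabilities like those in the table above are read off the model's next-token distribution, i.e. a softmax over the vocabulary logits at the last context position. A minimal sketch under the same assumptions as the earlier example (transformers library, gpt2 checkpoint); the probabilities it prints are the real model's, so they will not match the hypothetical Model A/B values in the question:

```python
# Minimal sketch: read Pr(next_word | context) off a model's next-token
# distribution. Assumes the Hugging Face "transformers" library and the
# public "gpt2" checkpoint; both are illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

context = ("The chef carefully added the final, crucial ingredient "
           "to the simmering stew: a pinch of")
candidates = ["salt", "concrete", "laughter", "thyme"]

input_ids = tokenizer(context, return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(input_ids).logits              # (1, seq_len, vocab_size)
next_token_probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over vocab

for word in candidates:
    # Leading space so the word is tokenized as it appears mid-sentence;
    # only the first sub-token is scored here, which is a simplification.
    token_id = tokenizer(" " + word).input_ids[0]
    print(f"Pr({word!r} | context) ~= {next_token_probs[token_id].item():.4f}")
```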
Mathematical Notation for Text Generation Probability
Evaluating Language Model Suitability
Predicting Next-Word Likelihood
Loss Function for Language Modeling
Learn After
Classification of Long Sequence Modeling Problems
A user provides the input 'Translate this to Spanish: The sky is blue' to a language model. The model, which has a specific set of learned weights and biases, generates the output 'El cielo es azul.' In the context of the notation for text generation probability, Pr_θ(y|x), which of the following correctly identifies the components of this interaction?
Evaluating Model Outputs with Probabilistic Notation
A language model is tasked with summarizing a news article. Match each component of the probabilistic notation used to describe this process with its corresponding role in the summarization task.