Formal Definition of LLM Inference
The inference process in Large Language Models (LLMs) is formally defined as finding the most probable output sequence given a user-provided context. Let x = x_0 x_1 … x_m denote the input token sequence (conceptually equivalent to a 'prompt'), where x_0 is the start symbol ⟨s⟩. Let y = y_1 y_2 … y_n denote the subsequent output token sequence (the response). The output tokens preceding position i are denoted y_<i = y_1 … y_{i-1}. The primary goal of LLM inference is to find the output sequence that maximizes the conditional probability Pr(y | x), that is, ŷ = argmax_y Pr(y | x). By the chain rule, this probability factorizes over output positions as Pr(y | x) = ∏_{i=1..n} Pr(y_i | x, y_<i). Furthermore, the input and output can be concatenated into a single sequence z = x ⊕ y (sometimes written {x, y}) to compute joint log-probabilities in decoder-only models.
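The objective above — scoring each candidate output y by Pr(y | x), accumulated token by token, and selecting the argmax — can be sketched in a few lines of Python. The vocabulary, the next-token probability table, and the candidate sequences below are hypothetical toy values chosen for illustration; they are not from the source.

```python
import math

# Toy next-token model: Pr(token | previous token).
# All probabilities here are made up for illustration.
NEXT = {
    "the": {"cat": 0.5, "mat": 0.3, "hat": 0.2},
    "cat": {"sat": 0.7, "the": 0.3},
    "sat": {"on": 0.9, "the": 0.1},
    "on":  {"the": 1.0},
}

def log_prob(context, output):
    """log Pr(y | x) = sum_i log Pr(y_i | x, y_<i) under the toy model."""
    seq = list(context)          # running sequence x, y_<i
    total = 0.0
    for tok in output:
        total += math.log(NEXT[seq[-1]][tok])
        seq.append(tok)          # extend the conditioning context
    return total

# Inference objective: y_hat = argmax_y Pr(y | x).
x = ["the"]
candidates = [["cat", "sat"], ["mat"], ["hat"]]
best = max(candidates, key=lambda y: log_prob(x, y))
print(best)  # the candidate with the highest conditional log-probability
```

In a real LLM the conditional Pr(y_i | x, y_<i) comes from a softmax over the full vocabulary at each step, and the argmax over all possible sequences is intractable, so decoding heuristics (greedy search, beam search, sampling) approximate it.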

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Related
Probability Distribution Formula for an Encoder-Softmax Language Model
Auto-Regressive Generation Process
Formal Definition of LLM Inference
Model Parameterization by θ
A language model built with a deep neural network is given the input sequence 'The cat sat on the'. The model's vocabulary consists of the following tokens: {a, cat, hat, mat, on, sat, the}. What does the model produce as its immediate, direct output to predict the very next token?
Analyzing Language Model Outputs
Explaining Language Model Output Behavior
Formal Definition of LLM Inference
Prefilling-Decoding Frameworks
Evaluating Instructional Approaches for Technical Documentation
A computer scientist is documenting a new, mathematically-intensive process for generating text with a large language model. They choose to define technical symbols and variables as they are introduced throughout the document, rather than providing a consolidated list of notations at the beginning. Which of the following outcomes is the most probable result of this documentation strategy?
The Role of Notation in Technical Clarity
Formal Definition of LLM Inference
Notation for Preceding Output Subsequence
Deconstructing a Model's Generated Text
Representing Model Output as a Token Sequence
A Large Language Model generates the sentence: 'AI is transforming our world.' How is this output fundamentally structured by the model before being presented to the user?
Separating Input and Output Variables in LLM Formulation
Start of Sequence (SOS) Token
Formal Definition of LLM Inference
A user provides the input 'Summarize this article', which a language model processes into three distinct tokens ('Summarize', 'this', 'article'). Based on the formal structure where an input sequence is represented by its tokens plus a special start symbol, what is the total number of tokens in the complete sequence given to the model?
A language model receives an input prompt that is tokenized into 10 tokens. According to the formal representation of an input sequence, x = x_0 x_1 … x_m, which of the following correctly describes the structure of the complete sequence processed by the model?
A language model is given the complete input token sequence: . By analyzing the components of this sequence, identify which token's primary role is to signal the beginning of the input context for the model.
Example of Reframing Text Classification as Text Generation
Instruction-based Prompts
Few-Shot Learning
Alternative Prompt Formats for Machine Translation
Text Classification in NLP
Versatility of Prompt Templates
Grammaticality Judgment as a Binary Classification Task for LLMs
Formal Definition of LLM Inference
Illustrative Purpose of Prompting Examples
The paradigm of using Large Language Models (LLMs) allows for many different NLP tasks (e.g., translation, sentiment analysis) to be reframed as a text generation problem. What is the fundamental advantage of this approach over traditional methods that required building a separate, specifically trained model for each individual task?
Reframing a Traditional NLP Task
Choosing an NLP Development Strategy
Classification via Prompt Completion
Reframing Numerical Scoring as Text Generation
Learn After
Mathematical Formulation of LLM Inference
Single-Round Prediction Problem
Token-Level Representation of Input and Output Sequences for a Forward Pass
Multi-Round Prediction Problem
Notation for Concatenated Token Sequences
A language model is given an input sequence of tokens representing the phrase 'The best way to learn a new skill is'. The model then calculates the likelihood for several possible completing sequences. Based on the formal objective of the text generation process, which of the following sequences should the model select to output?
Analyzing Model Output Selection
A language model is given an input context x. It then evaluates two potential output sequences, y_1 and y_2. The model's internal calculations determine that y_1 has a higher probability of occurring after x than y_2. However, a human evaluator finds y_2 to be more creative and detailed. According to the formal objective of the text generation process, what should the model do?