Sequence Evaluation using Log-Probability
In text generation, candidate sequences are ranked by their log-probability: given an input x, the quality of a candidate output sequence y = y1 ... yn is measured by log Pr(y|x). Because an autoregressive model factorizes the joint probability of a sequence into a product of per-token conditional probabilities, taking the logarithm turns that product into a sum:

log Pr(y|x) = Σ_{i=1}^{n} log Pr(y_i | y_1, ..., y_{i-1}, x)

Working in log space is standard practice because it is both numerically stable and cheap to compute: multiplying many probabilities below 1 quickly underflows floating-point arithmetic, whereas summing their logarithms keeps values well-scaled and lets the score be updated incrementally as each token is generated.
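A minimal Python sketch of the computation, using made-up per-token conditional probabilities (they are illustrative only, not the output of any real model):

```python
import math

# Illustrative per-token conditional probabilities Pr(y_i | y_<i, x) for two
# hypothetical candidate continuations of the same input x.
candidates = {
    "candidate A": [0.40, 0.55, 0.35, 0.72],
    "candidate B": [0.10, 0.02, 0.15, 0.30],
}

for name, probs in candidates.items():
    # Joint probability as a product of conditionals ...
    joint = math.prod(probs)
    # ... and the same quantity in log space, as a sum of log-conditionals:
    # log Pr(y|x) = sum_i log Pr(y_i | y_1, ..., y_{i-1}, x)
    log_prob = sum(math.log(p) for p in probs)
    print(f"{name}: product = {joint:.6f}, log-probability = {log_prob:.4f}")

# The candidate with the higher (less negative) total log-probability is the
# one the model considers more plausible.
best = max(candidates, key=lambda n: sum(math.log(p) for p in candidates[n]))
print(f"preferred: {best}")

# Why log space matters: for realistic sequence lengths the raw product
# underflows to 0.0 in double precision, while the log-sum stays exact.
long_seq = [0.1] * 400                       # 400 tokens, each with Pr = 0.1
print(math.prod(long_seq))                   # 0.0 (true value 1e-400 underflows)
print(sum(math.log(p) for p in long_seq))    # -921.0340..., still usable
```

In practice the per-token log-probabilities are read directly from the model's output distribution at each step, so the running sum can be maintained incrementally during generation rather than recomputed from scratch.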