Learn Before
Sampling-Based Search for LLM Inference
Sampling-based decoding methods introduce variation into the outputs of Large Language Models, overcoming the lack of diversity inherent in deterministic approaches such as greedy or beam search. These heuristic algorithms approximate the optimal output by drawing samples from the model's next-token probability distribution at each step. Because different runs can explore different candidate sequences, sampling is well suited to creative applications.
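To make this concrete, here is a minimal sketch of temperature sampling over a toy three-word vocabulary (the vocabulary, logits, and function names are illustrative assumptions, not part of any specific model's API). Instead of always taking the argmax token, we convert the model's logits into a probability distribution and draw from it, so repeated runs can produce different continuations:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution.

    Higher temperature flattens the distribution (more diversity);
    lower temperature sharpens it toward the argmax (more deterministic).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token from the distribution instead of taking the argmax."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Toy next-token logits: a deterministic decoder would always emit "cat",
# but sampling can also surface "dog" or "bird" across runs.
vocab = ["cat", "dog", "bird"]
logits = [2.0, 1.5, 0.5]

rng = random.Random(0)
print([sample_token(vocab, logits, temperature=1.0, rng=rng) for _ in range(5)])
```

Lowering the temperature toward zero recovers greedy, deterministic behavior; raising it increases diversity at the risk of less coherent output.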
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Sampling-Based Search for LLM Inference
Sequence Evaluation using Log-Probability
Deterministic Decoding Algorithms
Modifying the Search Objective to Improve Decoding
Maximum a Posteriori (MAP) Decoding
Speculative Decoding
Structured Search in Decoding
Trade-off between Search Quality and Computational Efficiency in Heuristic Search
An engineer is building a real-time chatbot that must respond to user queries very quickly. To achieve this speed, the engineer implements a text generation strategy that, at each step of forming a response, considers only a small subset of the most likely next words instead of all possible words in the vocabulary. What is the fundamental trade-off inherent in this design choice?
Evaluating a Decoding Algorithm Claim
Analysis of Competing Text Generation Systems
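The chatbot question above describes top-k sampling (also listed under Learn After): restricting each sampling step to the k most likely tokens. A minimal sketch with an illustrative toy vocabulary (names and values are assumptions for demonstration, not a real model's output):

```python
import math
import random

def top_k_sample(vocab, logits, k, rng=random):
    """Keep only the k highest-logit tokens, renormalize, and sample.

    Truncating the candidate set is cheap and filters out implausible
    tokens, but the model can never pick anything outside the top k --
    the trade-off between speed/safety and diversity raised above.
    """
    # Pair tokens with their logits and keep the k most likely.
    top = sorted(zip(vocab, logits), key=lambda p: p[1], reverse=True)[:k]
    m = max(l for _, l in top)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for _, l in top]
    tokens = [t for t, _ in top]
    return rng.choices(tokens, weights=weights, k=1)[0]

vocab = ["the", "a", "cat", "dog", "xylophone"]
logits = [3.0, 2.5, 1.0, 0.8, -4.0]

rng = random.Random(0)
# With k=2, only "the" and "a" can ever be produced.
print({top_k_sample(vocab, logits, k=2, rng=rng) for _ in range(20)})
```

Setting k to the full vocabulary size recovers plain sampling; small k trades diversity for speed and reliability.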
Learn After
A company is developing a system to automatically generate concise, factual summaries of legal documents. The system's primary requirements are high reliability and consistency, meaning the same document must always produce the exact same summary. The engineering team proposes using a text generation model that employs a sampling-based search method. Which statement best evaluates this proposal?
Rationale for Sampling in Creative Text Generation
Analyzing LLM Output Variability
Top-k Sampling