Concept

Top-p (Nucleus) Sampling

Top-p sampling, also known as nucleus sampling, is a decoding method that samples the next token from a dynamically sized candidate pool. The pool, called the nucleus, is the smallest set of the most probable tokens whose cumulative probability reaches or exceeds a predefined threshold p [Holtzman et al., 2020]. The probabilities of the tokens in the pool are renormalized to sum to 1, and the next token is drawn from that truncated distribution. Because the pool grows when the distribution is flat and shrinks when it is peaked, the method excludes the low-probability tokens in the long tail of the distribution, which helps prevent incoherent or nonsensical text.
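The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names nucleus and sample_top_p are hypothetical, the input is assumed to be an already normalized list of next-token probabilities indexed by token id, and the threshold is assumed to count as met once the cumulative probability reaches p.

```python
import random


def nucleus(probs, p):
    """Return {token_id: renormalized probability} for the top-p pool.

    Assumes `probs` is a normalized list of next-token probabilities
    indexed by token id (a hypothetical input format for this sketch).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")

    # Rank token ids from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Add tokens until the cumulative probability reaches the threshold.
    kept = []
    cumulative = 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= p:
            break

    # Renormalize so the pool forms a proper distribution.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_top_p(probs, p, rng=random):
    """Draw one token id from the renormalized top-p pool."""
    pool = nucleus(probs, p)
    token_ids = list(pool)
    weights = [pool[i] for i in token_ids]
    return rng.choices(token_ids, weights=weights, k=1)[0]
```

For example, with probabilities [0.5, 0.3, 0.1, 0.05, 0.05] and p = 0.85, the pool holds the first three tokens, and the two tail tokens can never be sampled. With a more peaked distribution the same p would admit fewer tokens, which is the sense in which the pool is dynamically sized.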

Updated 2026-05-05

Tags

Data Science

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences