Activity (Process)

Top-p (Nucleus) Sampling Process

Top-p (nucleus) sampling is a stochastic decoding technique for text generation that proceeds in several stages. First, in the expansion stage, the model computes a probability for every candidate next token in the vocabulary. Second, the candidates are ranked from most to least probable. Third, a 'nucleus' is selected: the smallest set of top-ranked tokens whose cumulative probability reaches or exceeds a predefined threshold 'p'. Fourth, the probabilities within the nucleus are renormalized so that they sum to 1. Finally, a single token is sampled from this renormalized distribution and emitted as the output. By discarding the long tail of low-probability tokens while still sampling among the likely ones, the method balances output quality and diversity.
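The stages described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a reference implementation: the function name `top_p_sample`, its parameters `probs`, `p`, and `rng`, and the example distribution are all assumptions made for this sketch, and it presumes the model's next-token probabilities are already available as a 1-D array that sums to 1.

```python
import numpy as np

def top_p_sample(probs, p=0.9, rng=None):
    """Hypothetical helper: sample one token id with top-p (nucleus) sampling.

    probs: 1-D array of next-token probabilities (assumed to sum to 1).
    p:     cumulative-probability threshold, 0 < p <= 1.
    rng:   optional numpy.random.Generator for reproducibility.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=np.float64)

    # Rank: token ids ordered from most to least probable.
    order = np.argsort(-probs)
    sorted_probs = probs[order]

    # Select the nucleus: the smallest prefix whose cumulative
    # probability reaches or exceeds p.
    cumulative = np.cumsum(sorted_probs)
    cutoff = int(np.searchsorted(cumulative, p, side="left")) + 1
    cutoff = min(cutoff, len(probs))  # guard against rounding when p is near 1
    nucleus_ids = order[:cutoff]

    # Renormalize so the nucleus probabilities sum to 1.
    nucleus_probs = sorted_probs[:cutoff] / sorted_probs[:cutoff].sum()

    # Sample a single token id from the renormalized nucleus.
    token = int(rng.choice(nucleus_ids, p=nucleus_probs))
    return token, nucleus_ids, nucleus_probs


# Illustrative distribution over a 5-token vocabulary (made-up numbers).
example_probs = [0.10, 0.50, 0.04, 0.30, 0.06]
token, nucleus_ids, nucleus_probs = top_p_sample(
    example_probs, p=0.85, rng=np.random.default_rng(0)
)
```

With the made-up distribution and p = 0.85, the nucleus consists of the three most probable tokens (ids 1, 3, and 0), and the two least probable tokens can never be sampled. Lowering p shrinks the nucleus toward greedy decoding, while p = 1 keeps the whole vocabulary.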


Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences