Learn Before
Concept

Balancing Randomness and Coherence in Token Sampling

Sampling-based decoding methods like Top-k and Top-p restrict the selection pool to a smaller subset of high-probability candidates, effectively striking a balance between output randomness and text coherence. This restriction enables the large language model to generate more diverse sequences while maintaining relevance and fluency. The hyperparameters k and p must be tuned carefully: excessively small values yield highly deterministic outputs that closely resemble greedy decoding, whereas overly large values can cause the model to produce degenerate outputs.
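The two filtering steps can be sketched in plain Python. This is a minimal illustration, not an excerpt from any particular library: `top_k_filter` and `top_p_filter` are hypothetical helper names, and each takes a full next-token probability distribution and returns the truncated, renormalized distribution one would then sample from.

```python
import random

def top_k_filter(probs, k):
    # Keep only the k highest-probability tokens, zero out the rest,
    # then renormalize so the kept probabilities sum to 1.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]

def top_p_filter(probs, p):
    # Keep the smallest set of highest-probability tokens whose
    # cumulative probability reaches p (nucleus sampling), then renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    kept = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(kept)
    return [q / total for q in kept]

# Toy 5-token vocabulary distribution (illustrative numbers only).
probs = [0.5, 0.25, 0.15, 0.07, 0.03]

# Sample one token id from the truncated distribution.
filtered = top_p_filter(probs, 0.9)
token_id = random.choices(range(len(filtered)), weights=filtered)[0]
```

With k=2, only the two most likely tokens survive; with p=0.9, the top three tokens (cumulative mass 0.9) survive. Shrinking k or p toward 1 token collapses this to greedy decoding, while growing them toward the full vocabulary recovers unrestricted sampling, which matches the tuning trade-off described above.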

Updated 2026-05-05

Tags

Foundations of Large Language Models

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences