Activity (Process)

Best-of-N Sampling (Parallel Scaling)

Parallel scaling, also known as best-of-N sampling, generates N independent candidate solutions by sampling from a base LLM N times on the same prompt. The sampling temperature controls how diverse the candidates are: higher values yield more varied outputs. Once all candidates are complete, a verifier scores each of the N solutions, and the highest-scoring one is returned as the final answer. This is conceptually analogous to using a reward model to pick the best of a set of sampled outputs.
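The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `best_of_n`, `make_toy_generator`, and `toy_verifier` are hypothetical names, and the toy generator and verifier stand in for a real LLM call and a real verifier or reward model, neither of which is specified here.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    prompt: str,
    generate: Callable[[str, float], str],
    score: Callable[[str, str], float],
    n: int = 4,
    temperature: float = 0.8,
) -> Tuple[str, List[Tuple[str, float]]]:
    """Sample n candidates, score each complete one, return the top scorer."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Each call is independent of the others, so they could run in parallel.
    candidates = [generate(prompt, temperature) for _ in range(n)]
    # The verifier sees only complete solutions, one at a time.
    scored = [(c, score(prompt, c)) for c in candidates]
    # On ties, max() keeps the first candidate with the highest score.
    best, _ = max(scored, key=lambda pair: pair[1])
    return best, scored


def make_toy_generator(seed: int = 0) -> Callable[[str, float], str]:
    """Hypothetical stand-in for an LLM: answers near 42, noisier when hot."""
    rng = random.Random(seed)

    def generate(prompt: str, temperature: float) -> str:
        noise = round(rng.gauss(0.0, 3.0 * temperature))
        return str(42 + noise)

    return generate


def toy_verifier(prompt: str, candidate: str) -> float:
    """Hypothetical stand-in for a verifier: closer to 42 scores higher."""
    return -abs(int(candidate) - 42)


if __name__ == "__main__":
    best, scored = best_of_n(
        "What is 17 + 25?", make_toy_generator(), toy_verifier, n=8
    )
    print("candidates and scores:", scored)
    print("selected answer:", best)
```

Passing the generator and verifier in as plain callables keeps the selection logic separate from any particular model API. Setting `temperature` to 0 in this toy makes every candidate identical, which shows why some sampling diversity is needed for the selection step to help.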


Updated 2026-05-06


Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
