Concept

Sampling-based Search with Penalty Objective

When a sampling-based search algorithm is used, a decoding objective with a penalty term can be applied after generation by rescoring the sampled candidate sequences. Each sequence is ranked by its penalized probability score, $\Pr(\mathbf{y}|\mathbf{x}) - \lambda \cdot \mathrm{Penalty}(\mathbf{x},\mathbf{y})$, where $\lambda$ controls the strength of the penalty, and the top-ranked sequences are selected to form the pool of candidates.
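The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the candidates have already been sampled and that each comes with its probability Pr(y|x). The names `rerank_with_penalty` and `repetition_penalty` are hypothetical, and the repetition-based penalty is only one possible choice of Penalty(x, y).

```python
from typing import Callable, List, Sequence, Tuple

# A sampled candidate: (tokens of y, Pr(y|x)). This layout is an assumption.
Candidate = Tuple[List[str], float]


def repetition_penalty(x: Sequence[str], y: Sequence[str]) -> float:
    """Hypothetical penalty: share of tokens in y that duplicate another token in y."""
    if not y:
        return 0.0
    return 1.0 - len(set(y)) / len(y)


def rerank_with_penalty(
    x: Sequence[str],
    candidates: List[Candidate],
    penalty: Callable[[Sequence[str], Sequence[str]], float],
    lam: float,
    top_k: int,
) -> List[Tuple[float, List[str]]]:
    """Score each candidate as Pr(y|x) - lam * Penalty(x, y) and keep the top_k."""
    scored = [(prob - lam * penalty(x, y), y) for y, prob in candidates]
    # Highest penalized score first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Usage with made-up candidates and probabilities (illustrative values only).
x = ["prompt"]
sampled = [
    (["a", "a", "a", "a"], 0.5),
    (["a", "b", "c", "d"], 0.4),
    (["a", "b", "a", "b"], 0.3),
]
pool = rerank_with_penalty(x, sampled, repetition_penalty, lam=0.4, top_k=2)
for score, y in pool:
    print(round(score, 3), y)
```

With these values the most probable sample is no longer ranked first, because its repetition penalty outweighs its probability advantage. In practice, scores are often computed from log-probabilities rather than raw probabilities; the sketch keeps the form of the objective as written above.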

Updated 2026-05-05

Tags

Foundations of Large Language Models

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences