Learn Before
Sampling-based Search with Penalty Objective
When using a sampling-based search algorithm, a decoding objective with a penalty term can be applied by evaluating the generated candidate sequences. The sequences are ranked according to their penalized probability score, Pr(y|x) - λ * Penalty(x, y), and the top-ranked sequences are then selected to form the pool of candidates.
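The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the candidate list, the log-probability values, and the simple repetition-count penalty (`repetition_penalty`) are all hypothetical stand-ins, and `rank_candidates` is an illustrative name.

```python
from collections import Counter

def repetition_penalty(tokens):
    # Hypothetical Penalty(x, y): number of token occurrences
    # beyond each token's first appearance in the output y.
    counts = Counter(tokens)
    return sum(c - 1 for c in counts.values())

def rank_candidates(candidates, lam, top_k):
    # candidates: list of (tokens, log_prob) pairs sampled from the model.
    # Score each candidate by log Pr(y|x) - λ * Penalty(x, y),
    # rank by the penalized score, and keep the top-k as the pool.
    scored = [(log_prob - lam * repetition_penalty(tokens), tokens)
              for tokens, log_prob in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]

# Toy sampled candidates (log-probabilities are made up for illustration).
candidates = [
    (["the", "cat", "sat"], -2.0),
    (["the", "the", "the"], -1.5),  # most probable, but heavily repetitive
    (["a", "dog", "ran"], -2.5),
]
pool = rank_candidates(candidates, lam=1.0, top_k=2)
```

With λ = 1.0, the repetitive candidate is penalized below the two non-repetitive ones, so it drops out of the selected pool even though its raw probability is highest.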
Tags
Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Penalty Function in Controllable Decoding
A developer is using a language model for text summarization. The model's outputs are generally fluent but suffer from excessive repetition of certain phrases. To address this, the developer employs a decoding objective that penalizes repetition, formulated as:
argmax [Pr(y|x) - λ * Penalty(x, y)], where Penalty(x, y) increases with the amount of repetition in the output y. How should the developer adjust the hyperparameter λ to make the summaries less repetitive?
Analyzing the Trade-off in Penalized Decoding
Consider the decoding objective for controllable text generation:
ŷ = argmax [Pr(y|x) - λ * Penalty(x, y)]. If the hyperparameter λ is set to 0, the objective simplifies to finding the output with the highest conditional probability, effectively ignoring any penalty.
Greedy Search with Penalty Objective