Learn Before
Sampling-based Search with Penalty Objective
When using a sampling-based search algorithm, a decoding objective with a penalty term can be applied by evaluating the generated candidate sequences. The sequences are ranked according to their penalized probability score, Pr(y|x) - λ * Penalty(x, y), and the top-ranked sequences are then selected to form the pool of candidates.
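The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: the candidate list, the log-probability values, and the simple repetition-count penalty (`repetition_penalty`) are all hypothetical stand-ins, and `rank_candidates` is an illustrative name.

```python
from collections import Counter

def repetition_penalty(tokens):
    # Hypothetical Penalty(x, y): number of token occurrences
    # beyond each token's first appearance in the output y.
    counts = Counter(tokens)
    return sum(c - 1 for c in counts.values())

def rank_candidates(candidates, lam, top_k):
    # candidates: list of (tokens, log_prob) pairs sampled from the model.
    # Score each candidate by log Pr(y|x) - λ * Penalty(x, y),
    # rank by the penalized score, and keep the top-k as the pool.
    scored = [(log_prob - lam * repetition_penalty(tokens), tokens)
              for tokens, log_prob in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]

# Toy sampled candidates (log-probabilities are made up for illustration).
candidates = [
    (["the", "cat", "sat"], -2.0),
    (["the", "the", "the"], -1.5),  # most probable, but heavily repetitive
    (["a", "dog", "ran"], -2.5),
]
pool = rank_candidates(candidates, lam=1.0, top_k=2)
```

With λ = 1.0, the repetitive candidate is penalized below the two non-repetitive ones, so it drops out of the selected pool even though its raw probability is highest.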
Tags
Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Penalty Function in Controllable Decoding
A developer is using a language model for text summarization. The model's outputs are generally fluent but suffer from excessive repetition of certain phrases. To address this, the developer employs a decoding objective that penalizes repetition, formulated as:
argmax [Pr(y|x) - λ * Penalty(x, y)], where Penalty(x, y) increases with the amount of repetition in the output y. How should the developer adjust the hyperparameter λ to make the summaries less repetitive?
Analyzing the Trade-off in Penalized Decoding
Consider the decoding objective for controllable text generation:
ŷ = argmax [Pr(y|x) - λ * Penalty(x, y)]. If the hyperparameter λ is set to 0, the objective simplifies to finding the output with the highest conditional probability, effectively ignoring any penalty.
Greedy Search with Penalty Objective