Learn Before
Greedy Search with Penalty Objective
In a greedy search algorithm, a decoding objective with a penalty term can be incorporated by evaluating candidate tokens and keeping only the single sequence that maximizes the penalized objective, Pr(y|x) - λ * Penalty(x, y), at each individual decoding step.
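As a minimal sketch of this idea: the toy model, the repetition-based penalty function, and the λ values below are all invented for illustration (they are not from the source), but the decoding loop itself follows the penalized objective Pr(y|x) - λ * Penalty(x, y), keeping only the single best-scoring token at each step.

```python
# Toy "model": fixed next-token log-probabilities, invented for illustration.
def toy_log_probs(prefix):
    scores = {"the": -1.2, "cat": -1.6, "sat": -1.9, "mat": -2.1}
    if prefix:
        # Make the toy model strongly prefer repeating its last token,
        # which the penalty term should counteract.
        scores = dict(scores)
        scores[prefix[-1]] = -0.5
    return scores

def repetition_penalty(prefix, token):
    # Hypothetical penalty: grows with how often `token` already appears.
    return prefix.count(token)

def greedy_step(prefix, lam):
    # Keep only the single candidate maximizing log Pr - lam * Penalty.
    scored = {
        tok: lp - lam * repetition_penalty(prefix, tok)
        for tok, lp in toy_log_probs(prefix).items()
    }
    return max(scored, key=scored.get)

def greedy_decode(steps, lam):
    out = []
    for _ in range(steps):
        out.append(greedy_step(out, lam))
    return out

print(greedy_decode(4, lam=0.0))  # no penalty: the model repeats itself
print(greedy_decode(4, lam=2.0))  # penalty active: repetition is suppressed
```

With λ = 0 the loop reduces to plain greedy search on the model's probabilities, so the repetition bias wins; with a large enough λ the penalty outweighs the repeated token's score and the output diversifies.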
Tags
Foundations of Large Language Models
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Penalty Function in Controllable Decoding
A developer is using a language model for text summarization. The model's outputs are generally fluent but suffer from excessive repetition of certain phrases. To address this, the developer employs a decoding objective that penalizes repetition, formulated as:
argmax [Pr(y|x) - λ * Penalty(x, y)], where Penalty(x, y) increases with the amount of repetition in the output y. How should the developer adjust the hyperparameter λ to make the summaries less repetitive?
Analyzing the Trade-off in Penalized Decoding
Consider the decoding objective for controllable text generation:
ŷ = argmax [Pr(y|x) - λ * Penalty(x, y)]. If the hyperparameter λ is set to 0, the objective simplifies to finding the output with the highest conditional probability, effectively ignoring any penalty.
Greedy Search with Penalty Objective
Sampling-based Search with Penalty Objective