Learn Before
Consider the decoding objective for controllable text generation: ŷ = argmax_y [Pr(y|x) - λ * Penalty(x, y)], where the maximization is over candidate outputs y. If the hyperparameter λ is set to 0, the objective reduces to finding the output with the highest conditional probability, effectively ignoring the penalty term.
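For concreteness, here is a minimal Python sketch of this objective over a toy candidate set. The function name select_output, the candidate strings, probabilities, and penalty values are all made up for illustration; they are not outputs of any real model.

```python
# Sketch of the penalized decoding objective
#   y_hat = argmax_y [ Pr(y|x) - lambda * Penalty(x, y) ]
# evaluated over an explicit list of candidates. All numbers are illustrative.

def select_output(candidates, lam):
    """Return the candidate maximizing Pr(y|x) - lam * Penalty(x, y)."""
    return max(candidates, key=lambda c: c["prob"] - lam * c["penalty"])

candidates = [
    {"text": "the cat sat on the mat",         "prob": 0.42, "penalty": 0.0},
    # Repetitive candidate that the model happens to rate slightly more probable.
    {"text": "the cat the cat sat on the mat", "prob": 0.45, "penalty": 0.6},
]

# With lam = 0 the penalty term vanishes, so the highest-probability
# (here, repetitive) output is selected.
print(select_output(candidates, lam=0.0)["text"])

# With lam > 0 the penalty is active and the less repetitive candidate wins.
print(select_output(candidates, lam=0.5)["text"])
```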
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Penalty Function in Controllable Decoding
A developer is using a language model for text summarization. The model's outputs are generally fluent but suffer from excessive repetition of certain phrases. To address this, the developer employs a decoding objective that penalizes repetition, formulated as:
argmax [Pr(y|x) - λ * Penalty(x, y)], where Penalty(x, y) increases with the amount of repetition in the output y. How should the developer adjust the hyperparameter λ to make the summaries less repetitive? (A minimal sketch of such a penalty appears after this list.)
Analyzing the Trade-off in Penalized Decoding
Greedy Search with Penalty Objective
Sampling-based Search with Penalty Objective
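To make the repetition-penalty question above concrete, here is a rough Python sketch in which Penalty(x, y) is taken to count repeated bigrams in y. This particular penalty definition, the helper names repetition_penalty and decode, and the candidate probabilities are all assumptions made for the example, not part of the original formulation.

```python
from collections import Counter

# Sketch of the repetition-penalized objective
#   argmax_y [ Pr(y|x) - lambda * Penalty(x, y) ]
# where Penalty(x, y) is assumed, for illustration, to count repeated bigrams in y.

def repetition_penalty(y):
    """Count bigram occurrences in y that repeat an earlier bigram."""
    tokens = y.split()
    counts = Counter(zip(tokens, tokens[1:]))
    return sum(c - 1 for c in counts.values())

def decode(candidates, lam):
    """Pick the candidate maximizing Pr(y|x) - lam * repetition_penalty(y)."""
    return max(candidates, key=lambda c: c["prob"] - lam * repetition_penalty(c["text"]))

# Toy candidate summaries with made-up probabilities; the repetitive one is
# slightly more probable, mimicking the failure mode described in the question.
candidates = [
    {"text": "sales rose sharply in the third quarter", "prob": 0.30},
    {"text": "sales rose sharply sales rose sharply in the third quarter", "prob": 0.35},
]

# Raising lambda makes repetition more costly, so the argmax shifts to the
# less repetitive summary: increasing lambda is the adjustment that reduces repetition.
for lam in (0.0, 0.05):
    print(lam, "->", decode(candidates, lam)["text"])
```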