Steering Language Model Output for Slogan Generation
A marketing team is using a language model to generate slogans. The model's initial probability for a slogan y given a product description x is given by π(y|x). The team finds that many of the highest-probability slogans are generic. To encourage more creative outputs, they decide to modify the likelihood of each slogan using the formula: New Score(y) = π(y|x) * exp(r(x, y)), where r(x, y) is a reward value assigned to each slogan.
Explain how you would design the reward function r(x, y) to achieve the team's goal. Specifically, describe what a positive reward, a negative reward, and a zero reward would signify in this context and how each would affect a slogan's final score.
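The multiplicative re-weighting in the formula above can be sketched in a few lines of Python. The slogans, base probabilities, and reward values below are invented purely for illustration; the point is that exp(r) > 1 boosts a score when r is positive, exp(r) < 1 shrinks it when r is negative, and exp(0) = 1 leaves it unchanged:

```python
import math

# Hypothetical base probabilities pi(y|x) and reward values r(x, y)
# for four candidate slogans (all values are made up for demonstration).
candidates = {
    "Quality you can trust":     (0.40, -1.0),  # generic  -> negative reward shrinks score
    "Buy now and save":          (0.30,  0.0),  # neutral  -> zero reward leaves score unchanged
    "Sip the sunrise, bottled":  (0.20,  1.5),  # creative -> positive reward boosts score
    "Your morning, reinvented":  (0.10,  1.0),
}

def new_score(prob, reward):
    """New Score(y) = pi(y|x) * exp(r(x, y))."""
    return prob * math.exp(reward)

scores = {y: new_score(p, r) for y, (p, r) in candidates.items()}
best = max(scores, key=scores.get)
```

Note that with these illustrative numbers the creative slogan overtakes the generic one even though its base probability is half as large, and the zero-reward slogan keeps exactly its original probability as its score.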
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Re-weighting a Reference Probability Distribution with a Scaled Reward
A language model is generating a completion for an input x. The model has a base probability distribution, π(y|x), for four potential completions (y). To steer the model's output, a reward function, r(x, y), is applied to create a new unnormalized score for each completion using the formula Score(y) = π(y|x) * exp(r(x, y)). Given the values below, which completion will have the highest score?

When using the formula Score(y) = π(y|x) * exp(r(x, y)) to adjust the likelihood of a potential output y, setting the reward r(x, y) to zero will cause the final score for that output to become zero, effectively eliminating it from consideration.