Analyzing the Asymmetry in Soft Prompt Optimization
Consider the formula for finding an optimal soft prompt, \(\hat{\sigma}\), by minimizing the difference between two probability distributions:

\[
\hat{\sigma} = \arg\min_{\sigma} \mathrm{KL}\big(\Pr(\cdot \mid c, z) \,\|\, \Pr(\cdot \mid \sigma, z)\big)
\]

In this formula, \(\Pr(\cdot \mid c, z)\) is the probability distribution over possible outputs given a full context \(c\) and an input \(z\), while \(\Pr(\cdot \mid \sigma, z)\) is the distribution given a soft prompt \(\sigma\) and the same input \(z\).
Explain why \(\Pr(\cdot \mid c, z)\) is treated as the first argument (the 'true' distribution) and \(\Pr(\cdot \mid \sigma, z)\) as the second argument within the KL divergence function, and not the other way around. What would be the conceptual implication of swapping their positions?
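A quick numerical check makes the asymmetry concrete. The sketch below is a minimal illustration over a made-up three-token output vocabulary; the distribution values and the names `p_true` and `q_soft` are illustrative assumptions, not taken from the course material:

```python
import math

def kl_divergence(p, q):
    # KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)).
    # The expectation is taken under P, the *first* argument.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy output distributions over a 3-token vocabulary (illustrative values).
p_true = [0.7, 0.2, 0.1]   # stands in for Pr(. | c, z), the target
q_soft = [0.5, 0.3, 0.2]   # stands in for Pr(. | sigma, z), the soft prompt

print(kl_divergence(p_true, q_soft))  # forward KL, ~0.085
print(kl_divergence(q_soft, p_true))  # reverse KL, ~0.092 -- a different value
```

Because the sum is weighted by the first argument, each ordering penalizes mismatches in different regions of the output space, which is why the two values generally disagree.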
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A researcher is training a soft prompt, denoted as \(\sigma\), to mimic the behavior of a full context, \(c\), for a given input, \(z\). They use the Kullback-Leibler (KL) divergence between the model's output probability distributions as their objective function:

\[
\mathrm{KL}\big(\Pr(\cdot \mid c, z) \,\|\, \Pr(\cdot \mid \sigma, z)\big)
\]

After extensive training, the researcher observes that the KL divergence has reached a value of 0. What is the most accurate conclusion to draw from this result?
Evaluating Soft Prompt Performance