Formula for Soft Prompt Optimization by Minimizing KL Divergence
An alternative approach to optimizing soft prompts involves minimizing the Kullback-Leibler (KL) divergence between the output probability distribution obtained from the full context, $\Pr(\cdot \mid c, z)$, and the distribution obtained from the soft prompt, $\Pr(\cdot \mid \sigma, z)$. The goal is to find the soft prompt that makes these two distributions as similar as possible. The optimization is expressed by the formula:

$$\hat{\sigma} = \arg\min_{\sigma} \mathrm{KL}\big(\Pr(\cdot \mid c, z) \,\|\, \Pr(\cdot \mid \sigma, z)\big)$$
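For intuition, here is a minimal, self-contained PyTorch sketch of this objective. The tiny stand-in model, the mean-pooling step, the soft-prompt length of 8, and all shapes are illustrative assumptions, not the implementation described here; with a real LLM, both distributions would come from the frozen model, with the soft prompt fed in as input embeddings.

```python
# Sketch: learn a soft prompt sigma so that the next-token distribution given
# [sigma; z] matches the distribution given the full context [c; z], by
# minimizing KL(Pr(. | c, z) || Pr(. | sigma, z)). Toy model, illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, d_model = 100, 32

# Stand-in "language model": token embeddings + a pooled hidden state
# projected to next-token logits. A real LLM would replace this.
embed = nn.Embedding(vocab_size, d_model)
lm_head = nn.Linear(d_model, vocab_size)
for p in list(embed.parameters()) + list(lm_head.parameters()):
    p.requires_grad_(False)  # the language model itself stays frozen

def next_token_logits(prefix_embeds: torch.Tensor) -> torch.Tensor:
    """Map a (seq_len, d_model) sequence of input embeddings to next-token logits."""
    hidden = prefix_embeds.mean(dim=0)  # crude pooling standing in for attention
    return lm_head(hidden)

c = torch.randint(0, vocab_size, (50,))  # full (long) context, as token ids
z = torch.randint(0, vocab_size, (5,))   # task input, as token ids

# Target distribution Pr(. | c, z), computed once and held fixed.
with torch.no_grad():
    p_full = F.softmax(next_token_logits(embed(torch.cat([c, z]))), dim=-1)

# Soft prompt sigma: a short trainable block of embeddings that replaces c.
sigma = nn.Parameter(torch.randn(8, d_model) * 0.02)
optimizer = torch.optim.Adam([sigma], lr=1e-2)

for step in range(200):
    log_q = F.log_softmax(next_token_logits(torch.cat([sigma, embed(z)])), dim=-1)
    # KL(P || Q): P = full-context distribution, Q = soft-prompt distribution.
    loss = F.kl_div(log_q, p_full, reduction="sum")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final KL(P || Q): {loss.item():.4f}")
```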

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Formula for Optimizing Soft Prompts via Context Compression
Formula for Soft Prompt Optimization via Log-Likelihood Maximization
Formula for Soft Prompt Optimization by Minimizing KL Divergence
An inference engine using a continuous batching strategy is currently processing a set of text generation requests that fully utilizes its processing capacity. At this point, a new, additional request arrives. What is the most likely immediate action the system's scheduler will take regarding this new request?
A language model is provided with a context c ('Translate the following sentence for a medical professional') and an input z ('Le patient présente une pyrexie'). The model computes the conditional probabilities for several potential English translations (y). Based on the principle of selecting the output that maximizes the conditional probability given the full context and input, which translation should the model choose as its prediction?
Analyzing Contextual Influence on LLM Predictions
Formula for Optimizing Soft Prompts via Context Compression
Formula for Soft Prompt Optimization by Minimizing KL Divergence
An LLM is provided with a compressed representation of context, denoted as σ, and an input z. The model's goal is to predict the most likely output y. After processing σ and z, the model computes the following conditional probabilities for four possible outputs:
- Pr(y='mat' | σ, z) = 0.65
- Pr(y='roof' | σ, z) = 0.25
- Pr(y='sky' | σ, z) = 0.05
- Pr(y='idea' | σ, z) = 0.05
Based on the principle of selecting the output that maximizes the conditional probability, what will the model's final prediction, ŷ_σ, be?
Deconstructing the LLM Prediction Formula
Analyzing an LLM's Incorrect Prediction
Formula for Soft Prompt Optimization by Minimizing KL Divergence
Derivation of the KL Divergence Objective for Policy Optimization
A machine learning model produces a probability distribution Q over a set of outcomes, aiming to approximate a true data distribution P. During evaluation, you observe that the divergence measure is low, while the reverse measure is high. Based on these results, what is the most likely characteristic of the model's distribution Q?
Calculating Divergence Between Distributions
Choosing a Loss Function for Model Distillation
Formula for Soft Prompt Optimization via Log-Likelihood Maximization
Formula for Soft Prompt Optimization by Minimizing KL Divergence
A team is creating a soft prompt to summarize a complex user manual for a question-answering model. Their main objective is not just to get the single correct answer, but to ensure the model's uncertainty and its ranking of other plausible-but-incorrect answers are the same with the soft prompt as they were with the full manual. Which of the following optimization strategies best aligns with this specific objective?
Choosing an Optimization Strategy for Soft Prompts
A researcher is optimizing a soft prompt. With the original, long context, the model predicts the correct answer with 60% probability and a plausible alternative with 30% probability. The researcher's goal is to create a soft prompt that causes the model to predict the correct answer with over 95% probability, even if this significantly changes the probability of the alternative answer. Which optimization approach is better suited for this specific goal?
Learn After
A researcher is training a soft prompt, denoted as σ, to mimic the behavior of a full context, c, for a given input, z. They use the Kullback-Leibler (KL) divergence between the model's output probability distributions as their objective function: KL(Pr(· | c, z) ‖ Pr(· | σ, z)). After extensive training, the researcher observes that the KL divergence has reached a value of 0. What is the most accurate conclusion to draw from this result?
Evaluating Soft Prompt Performance
Analyzing the Asymmetry in Soft Prompt Optimization
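The asymmetry referenced in the last two questions comes from the fact that KL divergence is not symmetric: KL(P ‖ Q) and KL(Q ‖ P) penalize different kinds of mismatch. The short, self-contained sketch below (with made-up probabilities loosely echoing the 60%/30% scenario above) makes this concrete.

```python
# Toy illustration of the KL asymmetry: the same pair of distributions gives
# a large divergence in one direction and a smaller one in the other.
import math

def kl(p, q):
    """KL(P || Q) = sum_y P(y) * log(P(y) / Q(y)); assumes Q(y) > 0 wherever P(y) > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# P: the full-context distribution spreads mass over two plausible answers.
# Q: a soft-prompt distribution that collapses almost all mass onto one answer.
P = [0.60, 0.30, 0.05, 0.05]
Q = [0.97, 0.01, 0.01, 0.01]

print(f"KL(P || Q) = {kl(P, Q):.3f}")  # ~0.89: Q ignores mass P puts on the runner-up
print(f"KL(Q || P) = {kl(Q, P):.3f}")  # ~0.40: wherever Q puts mass, P has some too
```

In this setting, minimizing KL(P ‖ Q) pushes the soft-prompt distribution Q to cover everything the full-context distribution P considers plausible, whereas maximizing the log-likelihood of a single correct answer only rewards concentrating probability mass on that answer.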