A researcher is training a soft prompt, denoted \(\sigma\), to mimic the behavior of a full context \(c\) for a given input \(z\). They use the Kullback-Leibler (KL) divergence between the model's output probability distributions as their objective function: \(\mathrm{KL}\big(\Pr(\cdot \mid c, z) \,\|\, \Pr(\cdot \mid \sigma, z)\big)\). After extensive training, the researcher observes that the KL divergence has reached a value of 0. What is the most accurate conclusion to draw from this result?
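The key fact the question tests can be checked numerically: KL divergence between two discrete distributions is zero exactly when the distributions are identical, and strictly positive otherwise. A minimal sketch (the vocabulary size and probability vectors below are illustrative, not from the source):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms where p == 0 contribute 0 by convention (lim x->0 of x log x = 0).
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical next-token distributions over a tiny 4-token vocabulary.
p_full_context = np.array([0.7, 0.2, 0.05, 0.05])   # stands in for Pr(. | c, z)
p_soft_prompt  = np.array([0.7, 0.2, 0.05, 0.05])   # stands in for Pr(. | sigma, z)
p_mismatched   = np.array([0.4, 0.4, 0.1, 0.1])

print(kl_divergence(p_full_context, p_soft_prompt))   # 0.0: identical distributions
print(kl_divergence(p_full_context, p_mismatched))    # > 0: any mismatch is penalized
```

So a KL of exactly 0 means the soft prompt reproduces the full context's output distribution for the inputs it was evaluated on; it says nothing, by itself, about inputs outside that set.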
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating Soft Prompt Performance
Analyzing the Asymmetry in Soft Prompt Optimization