Formula for Soft Prompt Optimization by Minimizing KL Divergence

An alternative approach to optimizing soft prompts minimizes the Kullback-Leibler (KL) divergence between the output probability distribution induced by the full context, $\text{Pr}(\cdot|\mathbf{c}, \mathbf{z})$, and the distribution induced by the soft prompt, $\text{Pr}(\cdot|\sigma, \mathbf{z})$. The goal is to find the soft prompt $\hat{\sigma}$ that makes these two distributions as similar as possible:

$$\hat{\sigma} = \underset{\sigma}{\arg\min}\, \text{KL}\big(\text{Pr}(\cdot|\mathbf{c}, \mathbf{z}) \,\|\, \text{Pr}(\cdot|\sigma, \mathbf{z})\big)$$
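The optimization above can be sketched in PyTorch. This is a minimal toy illustration, not the method from the text: the frozen linear `lm_head` and the `next_token_dist` helper are hypothetical stand-ins for a real frozen language model, chosen so the example is self-contained. The soft prompt $\sigma$ is a short set of learnable embedding vectors, trained by gradient descent to match the next-token distribution produced by the much longer context $\mathbf{c}$.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a frozen LM: maps averaged input embeddings to vocab
# logits. (Hypothetical component; a real setup would use a frozen
# transformer and its actual embedding sequence.)
vocab_size, d_model = 50, 16
lm_head = torch.nn.Linear(d_model, vocab_size)
for p in lm_head.parameters():
    p.requires_grad_(False)

def next_token_dist(embeddings):
    """Pr(. | embeddings): toy next-token distribution."""
    return F.softmax(lm_head(embeddings.mean(dim=0)), dim=-1)

# Full context c (10 tokens) and query z (3 tokens), as embeddings.
c = torch.randn(10, d_model)
z = torch.randn(3, d_model)
target = next_token_dist(torch.cat([c, z])).detach()  # Pr(. | c, z)

# Soft prompt sigma: 2 learnable vectors, much shorter than c.
sigma = torch.randn(2, d_model, requires_grad=True)
opt = torch.optim.Adam([sigma], lr=0.1)

losses = []
for step in range(200):
    opt.zero_grad()
    pred = next_token_dist(torch.cat([sigma, z]))  # Pr(. | sigma, z)
    # KL(target || pred); F.kl_div expects log-probs as its first argument.
    loss = F.kl_div(pred.log(), target, reduction="sum")
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f"KL before/after: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

Note the argument order: minimizing the forward KL $\text{KL}(\text{Pr}(\cdot|\mathbf{c},\mathbf{z}) \,\|\, \text{Pr}(\cdot|\sigma,\mathbf{z}))$ makes the soft-prompt distribution cover all tokens the full-context distribution assigns mass to.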


Updated 2026-05-02

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
