Formula for Soft Prompt Optimization by Minimizing Prediction Dissimilarity
The optimal soft prompt, denoted $\hat{\sigma}$, can be determined by finding the prompt that minimizes the dissimilarity between the model's predictions with and without the full context. This is expressed by the formula:

$$\hat{\sigma} = \arg\min_{\sigma} s(\hat{y}, \hat{y}_{\sigma})$$

Here, $s(\cdot,\cdot)$ is a function measuring the dissimilarity (e.g., a distance) between $\hat{y}$, the prediction from the full context, and $\hat{y}_{\sigma}$, the prediction using the soft prompt. This method aligns the compact prompt's behavior with that of the original, more descriptive prompt.
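To make the objective concrete, here is a minimal training-loop sketch in PyTorch with Hugging Face Transformers. It is illustrative, not a reference implementation: the model `gpt2`, the soft-prompt length, the single fixed user input, the helper `next_token_logits`, and the choice of KL divergence as the dissimilarity $s$ are all assumptions made for this example.

```python
# Sketch: optimize a soft prompt sigma so that the model's prediction with
# sigma matches its prediction with the full textual context.
# Assumptions: PyTorch + Hugging Face transformers, gpt2, KL divergence as s.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM would do; gpt2 is an arbitrary choice
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()
for p in model.parameters():
    p.requires_grad_(False)  # the LM is frozen; only sigma is trained

num_soft_tokens = 8  # hypothetical soft-prompt length
embed_dim = model.config.n_embd
soft_prompt = torch.nn.Parameter(torch.randn(num_soft_tokens, embed_dim) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

full_context = "Translate the user's English input into French. Be concise."
user_input = "Hello, world."

def next_token_logits(prefix_embeds, input_ids):
    """Logits for the next token after input_ids, with prefix_embeds prepended."""
    input_embeds = model.get_input_embeddings()(input_ids)
    embeds = torch.cat([prefix_embeds, input_embeds], dim=1)
    return model(inputs_embeds=embeds).logits[:, -1, :]

ctx_ids = tokenizer(full_context, return_tensors="pt").input_ids
ctx_embeds = model.get_input_embeddings()(ctx_ids)
x_ids = tokenizer(user_input, return_tensors="pt").input_ids

# hat(y): the (fixed) prediction given the full, descriptive context.
with torch.no_grad():
    y_full = F.log_softmax(next_token_logits(ctx_embeds, x_ids), dim=-1)

for step in range(200):
    # hat(y)_sigma: the prediction given only the soft prompt.
    y_sigma = F.log_softmax(next_token_logits(soft_prompt.unsqueeze(0), x_ids), dim=-1)
    # s(hat(y), hat(y)_sigma): KL divergence between the two distributions.
    loss = F.kl_div(y_sigma, y_full, log_target=True, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In practice the loop would sample many user inputs rather than a single fixed one, so that $\hat{y}_{\sigma}$ tracks $\hat{y}$ across the input distribution. KL divergence is a natural instantiation of $s$ because both predictions are distributions over the vocabulary; this also makes the setup a form of knowledge distillation, with the full-context model as teacher and the soft-prompted model as student.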

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Optimizing Language Model API Costs
A team is training a set of learnable, continuous parameters to serve as a compact substitute for a long, detailed textual instruction set for a language model. The goal is for these compact parameters to guide the model to produce the same quality of output as the original long instructions when given any user input. Which of the following best describes the core objective of this training process?
Characteristics of Teacher and Student Models in Knowledge Distillation
In the framework of learning a soft prompt via knowledge distillation to compress a longer context, match each component with its corresponding role in the process.
Learn After
Troubleshooting Soft Prompt Optimization
A researcher is using the following formula to find the best soft prompt $\sigma$ for a large language model:

$$\hat{\sigma} = \arg\min_{\sigma} s(\hat{y}, \hat{y}_{\sigma})$$

In this formula, $\hat{y}$ is the model's prediction given a full, descriptive context, and $\hat{y}_{\sigma}$ is the prediction given the soft prompt. What is the fundamental goal of this optimization process?
Evaluating Soft Prompt Generalization