Learn Before
Formula
KL Divergence Objective for Distilling Context into Soft Prompts
An alternative objective for distilling a full context into continuous soft prompt embeddings is to minimize the Kullback-Leibler (KL) divergence between the output distributions of the teacher and student models. Writing $c$ for the full context, $\sigma$ for the soft prompt that compresses it, and $x$ for the input, the objective is

$$\hat{\sigma} = \arg\min_{\sigma} \mathrm{KL}\Big( \Pr(\cdot \mid c, x) \;\Big\|\; \Pr(\cdot \mid \sigma, x) \Big)$$

which directly aligns the student model's probability distribution given the compressed context $\sigma$ and the input $x$ with the teacher model's distribution given the full context $c$ and the input $x$.
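Below is a minimal PyTorch sketch of this objective, assuming a causal LM that accepts precomputed input embeddings (`inputs_embeds`, as in Hugging Face transformers). The names (`kl_distillation_loss`, `distill_step`, `soft_prompt`) and the tensor shapes are illustrative assumptions, not from the card.

```python
# Sketch of the KL-divergence objective for soft prompt distillation.
# `model`, `soft_prompt`, and all shapes are assumptions for illustration.
import torch
import torch.nn.functional as F


def kl_distillation_loss(teacher_logits, student_logits):
    """KL( Pr(. | c, x) || Pr(. | sigma, x) ), averaged over the batch.

    Both tensors have shape (batch, seq_len, vocab_size): the teacher's
    logits come from the full context c, the student's from the soft
    prompt sigma, each followed by the same input x.
    """
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits, dim=-1)
    # F.kl_div computes KL(target || input); the input must be
    # log-probabilities, and log_target=True says the target is too.
    return F.kl_div(student_log_probs, teacher_log_probs,
                    log_target=True, reduction="batchmean")


def distill_step(model, soft_prompt, c_embeds, x_embeds, optimizer):
    """One optimization step: only the soft prompt embeddings are trained.

    `model` is assumed to be a frozen causal LM accepting `inputs_embeds`;
    `soft_prompt` has shape (num_prompt_tokens, hidden_size).
    """
    x_len = x_embeds.size(1)
    with torch.no_grad():  # teacher: full context c prepended to x
        teacher = model(inputs_embeds=torch.cat([c_embeds, x_embeds], dim=1))
    # Student: soft prompt sigma prepended to the same input x.
    prompt = soft_prompt.unsqueeze(0).expand(x_embeds.size(0), -1, -1)
    student = model(inputs_embeds=torch.cat([prompt, x_embeds], dim=1))
    # Align the two distributions over the positions of x only.
    loss = kl_distillation_loss(teacher.logits[:, -x_len:],
                                student.logits[:, -x_len:])
    optimizer.zero_grad()
    loss.backward()  # gradients flow only into soft_prompt
    optimizer.step()
    return loss.item()
```

In this sketch `soft_prompt` would be a `torch.nn.Parameter` and the optimizer something like `torch.optim.AdamW([soft_prompt])`, so the gradient updates touch only the soft prompt embeddings while the model weights stay frozen.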
Updated 2026-04-30
Tags
Foundations of Large Language Models
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences