Context Distillation into Prompt Embeddings
Applying knowledge distillation to context compression involves treating the model conditioned on the full context as the teacher and the model conditioned on the compressed context as the student. Unlike standard context distillation, where the compressed context still consists of discrete tokens, this method distills the context into real-valued vectors that act as prompt embeddings. Furthermore, the teacher and student are not required to share the same architecture; typically, a stronger model serves as the teacher, while a smaller, more efficient model acts as the student.
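A minimal sketch of this setup, assuming a PyTorch and Hugging Face Transformers environment; the model names, example context, number of soft tokens, and training hyperparameters are illustrative placeholders rather than a prescribed recipe:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

# Teacher (stronger model, sees the full context) and student (smaller model,
# sees only the learned prompt embeddings). Both share the GPT-2 vocabulary,
# so their output distributions are directly comparable.
teacher = AutoModelForCausalLM.from_pretrained("gpt2-large")
student = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

context = "Translate English to French. Example: cat -> chat. Example: dog -> chien."
query = "Translate: house ->"

# Learnable real-valued prompt embeddings that will absorb the context.
num_soft_tokens = 8
soft_prompt = torch.nn.Parameter(
    torch.randn(num_soft_tokens, student.config.hidden_size) * 0.02
)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

full_ids = tokenizer(context + " " + query, return_tensors="pt").input_ids
query_ids = tokenizer(query, return_tensors="pt").input_ids

for step in range(100):
    # Teacher prediction conditioned on the full discrete context plus the query.
    with torch.no_grad():
        teacher_logits = teacher(full_ids).logits[:, -1, :]

    # Student prediction conditioned on the soft prompt prepended to the query.
    query_embeds = student.get_input_embeddings()(query_ids)
    inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), query_embeds], dim=1)
    student_logits = student(inputs_embeds=inputs_embeds).logits[:, -1, :]

    # Distillation loss: match the student's next-token distribution to the
    # teacher's. Only the soft prompt is updated; both models stay frozen.
    loss = F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After training, the soft prompt can be cached and prepended to new queries in place of the original context, so the student never has to re-encode the full context at inference time.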