Concept

Context Distillation into Prompt Embeddings

Applying knowledge distillation to context compression involves treating the full-context prediction as the teacher model and the compressed-context prediction as the student model. Unlike standard context distillation, where the compressed context consists of discrete tokens, this method distills the context $\mathbf{c}$ into real-valued vectors $\sigma$, which act as prompt embeddings. Furthermore, the teacher and student models are not required to share the same architecture; typically, a stronger model serves as the teacher, while a smaller, more efficient model acts as the student.
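
The sketch below illustrates how such a distillation loop could look in PyTorch. It is an assumption-laden illustration rather than a reference implementation: `teacher` and `student` are taken to be Hugging Face-style causal LMs that share one tokenizer (the KL term requires a common vocabulary), `embed` is the student's input-embedding layer, and all identifiers (`distill_context`, `context_ids`, `query_ids`, `num_vectors`) are hypothetical.

```python
import torch
import torch.nn.functional as F

def distill_context(teacher, student, embed, context_ids, query_ids,
                    num_vectors=16, steps=200, lr=1e-2):
    """Distill the discrete context c into prompt embeddings sigma."""
    d_model = embed.embedding_dim
    # sigma: trainable real-valued vectors that replace the context tokens.
    sigma = torch.randn(1, num_vectors, d_model, requires_grad=True)
    # The optimizer updates only sigma; both models' weights stay unchanged.
    opt = torch.optim.Adam([sigma], lr=lr)

    # Teacher's next-token distribution, conditioned on the full context.
    with torch.no_grad():
        full_input = torch.cat([context_ids, query_ids], dim=1)
        t_probs = F.softmax(teacher(full_input).logits[:, -1, :], dim=-1)

    for _ in range(steps):
        # The student sees sigma in place of the context tokens.
        student_in = torch.cat([sigma, embed(query_ids)], dim=1)
        s_log_probs = F.log_softmax(
            student(inputs_embeds=student_in).logits[:, -1, :], dim=-1)

        # KL(teacher || student): pull the compressed-context prediction
        # toward the full-context prediction.
        loss = F.kl_div(s_log_probs, t_probs, reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()

    return sigma.detach()
```

For brevity the sketch matches only the distribution at the final position; in practice one would typically sum the KL term over every query position, and over a set of sampled queries, so that $\sigma$ generalizes beyond a single continuation.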
