Formula
Log-Likelihood Objective for Distilling Context into Soft Prompts
When applying knowledge distillation to compress context into soft prompts, a simple training objective seeks to maximize the log-likelihood of the teacher model's prediction given the compressed representation. This is formalized as $\hat{\mathbf{p}} = \arg\max_{\mathbf{p}} \log \Pr(\hat{\mathbf{y}} \mid \mathbf{p}, \mathbf{x})$, where $\hat{\mathbf{y}}$ is the prediction from the teacher model using the full context, $\mathbf{p}$ represents the continuous prompt embeddings, and $\mathbf{x}$ is the user input.
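The objective can be sketched with a toy model: in practice the loss minimized is the negative log-likelihood of the teacher's full-context prediction given the soft prompt and the input. Everything below (the linear scoring model, the weights, the dimensions, and all function names) is an illustrative assumption, not an implementation from the source.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_likelihood(soft_prompt, x, weights, teacher_pred):
    # Toy stand-in for log Pr(y_hat | p, x): the model scores each
    # vocabulary item by a dot product between per-item weights and
    # the concatenated [soft prompt; input] feature vector.
    # (Hypothetical model, for illustration only.)
    features = soft_prompt + x
    logits = [sum(w_i * f_i for w_i, f_i in zip(w, features)) for w in weights]
    probs = softmax(logits)
    return math.log(probs[teacher_pred])

def distill_loss(soft_prompt, x, weights, teacher_pred):
    # Minimizing this loss is equivalent to maximizing
    # log Pr(y_hat | p, x) with respect to the soft prompt p.
    return -log_likelihood(soft_prompt, x, weights, teacher_pred)
```

Training would adjust only the soft-prompt entries (gradient descent on `distill_loss`), leaving the frozen model weights untouched; a prompt whose features align with the teacher's prediction yields a lower loss than a misaligned one.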
Updated 2026-04-30
Tags
Foundations of Large Language Models
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences