Case Study

Evaluating Soft Prompt Generalization

A data scientist is developing a soft prompt to summarize legal documents. They search for the optimal prompt via σ̂ = argmin_σ s(ŷ, ŷ_σ), where ŷ is the model's desired summary (generated with full context), ŷ_σ is the summary generated using the soft prompt σ, and s is a dissimilarity measure. The optimization drives s to near zero on the training dataset, yet when tested on new, unseen legal documents, the soft prompt produces poor-quality summaries. Analyze the most likely reason for this gap between near-perfect training performance and poor test performance.
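The failure mode the question points at can be demonstrated with a toy sketch: if the soft prompt σ is optimized only against training references, it can absorb idiosyncrasies of the training set rather than the task itself. The snippet below is a deliberately simplified, hypothetical illustration (the linear `model`, the shared `offset`, and the squared-error `dissimilarity` are all assumptions, not the chapter's actual setup): σ̂ has a closed form here, drives the training dissimilarity to zero, and still fails on a held-out document whose reference follows a different pattern.

```python
import numpy as np

def dissimilarity(y_ref, y_sigma):
    # s(ŷ, ŷ_σ): squared-error dissimilarity between reference and prompted output
    return float(np.mean((y_ref - y_sigma) ** 2))

def model(x, sigma):
    # Toy "model": output is the document representation shifted by the soft prompt σ
    return x + sigma

# Three training "documents" whose reference summaries all share the SAME offset —
# an idiosyncrasy of the training set, not a property of the task
train_x = [np.array([1.0, 2.0]), np.array([0.0, 1.0]), np.array([3.0, -1.0])]
offset = np.array([0.5, -0.5])
train_y = [x + offset for x in train_x]

# argmin_σ Σ_i s(y_i, model(x_i, σ)) has the closed form σ̂ = mean_i (y_i - x_i)
sigma_hat = np.mean([y - x for x, y in zip(train_x, train_y)], axis=0)

train_s = np.mean([dissimilarity(y, model(x, sigma_hat))
                   for x, y in zip(train_x, train_y)])

# An unseen document whose reference summary follows a DIFFERENT offset
test_x = np.array([2.0, 2.0])
test_y = test_x + np.array([-1.0, 1.0])
test_s = dissimilarity(test_y, model(test_x, sigma_hat))

print(train_s)  # ≈ 0: σ̂ matches the training distribution exactly
print(test_s)   # large: σ̂ encoded the training offset, not the general task
```

The point of the sketch: a near-zero training value of s says only that σ̂ reproduces the training references, which is exactly the overfitting pattern the question asks the reader to diagnose.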

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models


Analysis in Bloom's Taxonomy
