LLM Prediction with Compressed Context
The prediction of a Large Language Model, denoted as ŷ_σ, when using a soft prompt σ (a compressed context) and an input z, is determined by selecting the output that maximizes the conditional probability. This is formally expressed as:

ŷ_σ = argmax_y Pr(y | σ, z)

This prediction is compared against the prediction from the full context to optimize the soft prompt.
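A minimal sketch of the selection rule, assuming the LLM exposes a conditional distribution Pr(y | σ, z) over a small set of candidate outputs (the candidates and probability values here are hypothetical):

```python
def predict(cond_probs):
    """Return the candidate y with the highest conditional probability Pr(y | sigma, z)."""
    return max(cond_probs, key=cond_probs.get)

# Hypothetical conditional probabilities Pr(y | sigma, z) for four candidate outputs.
cond_probs = {"mat": 0.65, "roof": 0.25, "sky": 0.05, "idea": 0.05}

y_hat_sigma = predict(cond_probs)  # the argmax over candidates, here "mat"
```

In practice the candidate set is the model's output vocabulary (or sequences over it), but the argmax principle is the same.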
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
LLM Prediction with Full Context
LLM Prediction with Compressed Context
Mathematical Formulation of Prompt Ensembling
Formula for Scoring Reasoning Paths by Counting Correct Steps
A classification model is given an input, x, and must choose an output, y, from the set of possible classes {A, B, C, D}. The model's decision rule is to select the class that has the highest conditional probability, Pr(y|x). Given the following probabilities calculated by the model for the input x, what will its final prediction be?
- Pr(y=A | x) = 0.15
- Pr(y=B | x) = 0.55
- Pr(y=C | x) = 0.25
- Pr(y=D | x) = 0.05
Model Prediction vs. Ground Truth
Analyzing a Model's Prediction Choice
Learn After
Formula for Optimizing Soft Prompts via Context Compression
Formula for Soft Prompt Optimization by Minimizing KL Divergence
An LLM is provided with a compressed representation of context, denoted as σ, and an input z. The model's goal is to predict the most likely output y. After processing σ and z, the model computes the following conditional probabilities for four possible outputs:
- Pr(y='mat' | σ, z) = 0.65
- Pr(y='roof' | σ, z) = 0.25
- Pr(y='sky' | σ, z) = 0.05
- Pr(y='idea' | σ, z) = 0.05
Based on the principle of selecting the output that maximizes the conditional probability, what will the model's final prediction, ŷ_σ, be?
Deconstructing the LLM Prediction Formula
Analyzing an LLM's Incorrect Prediction