A research team is developing a system to answer questions based on a large document. Instead of feeding the entire document into a language model for every question, they want to learn a compressed, continuous representation of the document (a 'soft prompt', σ). Their process is as follows:
- First, for a given question (z), they run the model with the full document to get a high-quality, 'gold standard' answer (ŷ).
- Next, they try to find the optimal soft prompt (σ) that, when paired with the original question (z), causes the model to produce that same 'gold standard' answer (ŷ).
They define the 'optimal' soft prompt as the one that makes the probability of generating the 'gold standard' answer as high as possible. Based on this optimization strategy, which statement best describes the primary goal?
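The objective described above, finding σ* = argmax_σ P(ŷ | σ, z), can be sketched with a toy numerical model. Everything here (the frozen weight matrix `W`, the embedding sizes, the gradient-ascent loop) is an illustrative assumption, not the team's actual system; it only shows what "make the probability of the gold answer as high as possible" means as an optimization over σ:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM = 5, 4                        # toy vocabulary size and embedding dim
W = rng.normal(size=(VOCAB, 2 * DIM))    # frozen "model" weights (hypothetical)
z = rng.normal(size=DIM)                 # fixed embedding of the question z
y_gold = 2                               # index of the gold-standard answer ŷ

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def answer_prob(sigma):
    """P(ŷ | σ, z): probability the toy model assigns to the gold answer."""
    logits = W @ np.concatenate([sigma, z])
    return softmax(logits)[y_gold]

# Gradient ascent on log P(ŷ | σ, z) with respect to the soft prompt σ only;
# the model weights W and the question embedding z stay frozen.
sigma = np.zeros(DIM)
p_before = answer_prob(sigma)
for _ in range(200):
    logits = W @ np.concatenate([sigma, z])
    p = softmax(logits)
    grad_logits = -p
    grad_logits[y_gold] += 1.0           # d log p[ŷ] / d logits = onehot(ŷ) - p
    grad_sigma = (W.T @ grad_logits)[:DIM]
    sigma += 0.5 * grad_sigma
p_after = answer_prob(sigma)
print(p_before, "->", p_after)
```

Note the asymmetry the question trades on: only σ is updated, so the compressed prompt is being trained to stand in for the full document with respect to this one question-answer pair.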
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Interpreting the Soft Prompt Optimization Formula
A team is training a soft prompt (σ) to help a language model generate a specific, high-quality target sentence (ŷ) when given an input (z). They are considering two different optimization objectives:
- Objective 1: Adjust the soft prompt σ to maximize the probability of the model generating the exact target sentence ŷ.
- Objective 2: Adjust the soft prompt σ so that the model's entire probability distribution over possible next words matches the distribution it would have produced had it been conditioned on the full, original context instead of the prompt.
Which statement best evaluates the fundamental difference in what these two objectives are trying to achieve?
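The contrast between the two objectives can be made concrete with toy next-word distributions. The numbers below are invented for illustration: Objective 1 is a cross-entropy against the single target ŷ (only the probability of ŷ matters), while Objective 2 is a KL divergence against the full-context ("teacher") distribution, which also penalizes mismatches on words that are not ŷ:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical next-word distributions over a 4-word vocabulary.
teacher = softmax(np.array([2.0, 1.0, 0.5, 0.0]))  # model given the full context
student = softmax(np.array([1.0, 1.5, 0.2, 0.3]))  # model given the soft prompt σ
y_hat = 0                                          # target word ŷ for Objective 1

# Objective 1: cross-entropy against the single target ŷ.
loss_mle = -np.log(student[y_hat])

# Objective 2: KL(teacher || student) -- match the whole distribution.
loss_kl = float(np.sum(teacher * (np.log(teacher) - np.log(student))))

# A student that piles nearly all its mass on ŷ makes Objective 1 almost zero,
# yet moves *further* from the teacher's distribution under Objective 2.
peaked = np.array([0.97, 0.01, 0.01, 0.01])
loss_mle_peaked = -np.log(peaked[y_hat])
loss_kl_peaked = float(np.sum(teacher * (np.log(teacher) - np.log(peaked))))

print(loss_mle, loss_kl, loss_mle_peaked, loss_kl_peaked)
```

The `peaked` case is the crux: Objective 1 rewards collapsing onto the one target answer, whereas Objective 2 is only minimized when the soft prompt reproduces the full-context model's behavior, including the probability it spreads over alternative continuations.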