Output Selection in a Prefix-Tuned Transformer Layer
In a Transformer layer adapted for prefix fine-tuning, the layer's input is the concatenation of trainable prefix vectors with the hidden states produced by the previous layer. After the layer processes this combined sequence, the output positions corresponding to the prefixes are discarded; only the hidden states for the original input sequence are passed on. This selective output keeps the sequence length constant across layers: the retained states serve as the previous-layer input component for the subsequent layer, which prepends its own prefix vectors in turn, maintaining a consistent structure throughout the network. The sketch below illustrates this computation.
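
To make the data flow concrete, here is a minimal PyTorch-style sketch of the selection step described above. The class name PrefixTunedLayer, the use of nn.TransformerEncoderLayer as the frozen base layer, and the parameter names (d_model, n_heads, prefix_len) are illustrative assumptions for this sketch, not the API of any particular prefix-tuning library.

    import torch
    import torch.nn as nn

    class PrefixTunedLayer(nn.Module):
        """Illustrative sketch: trainable prefix vectors are prepended to the
        layer input, and their output positions are discarded afterward."""

        def __init__(self, d_model: int, n_heads: int, prefix_len: int):
            super().__init__()
            # Frozen base layer: only the prefix vectors below are trained.
            self.base = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for p in self.base.parameters():
                p.requires_grad = False
            # One trainable vector per prefix position.
            self.prefix = nn.Parameter(torch.randn(prefix_len, d_model))
            self.prefix_len = prefix_len

        def forward(self, h_prev: torch.Tensor) -> torch.Tensor:
            # h_prev: (batch, seq_len, d_model), hidden states from the layer below.
            batch = h_prev.size(0)
            prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
            combined = torch.cat([prefix, h_prev], dim=1)  # prepend the prefixes
            out = self.base(combined)                      # process the combined sequence
            # Discard the prefix positions; keep only the original sequence.
            return out[:, self.prefix_len:, :]

    layer = PrefixTunedLayer(d_model=16, n_heads=4, prefix_len=5)
    h = torch.randn(2, 10, 16)   # batch of 2, original sequence length 10
    print(layer(h).shape)        # torch.Size([2, 10, 16]) -- prefix positions removed

Because the prefix positions are sliced off before the result is returned, the output has the same shape as h_prev, so the next layer can prepend its own prefix without the sequence growing at every depth.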

Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Output Selection in a Prefix-Tuned Transformer Layer
An internal layer of a large language model is adapted for a new task. Its input is a single matrix created by concatenating a sequence of newly introduced, task-specific vectors with the sequence of hidden state vectors produced by the preceding layer. Which statement correctly analyzes the properties of these two constituent sequences?
Input Matrix Dimension Calculation
Consider a Transformer layer where the input is formed by prepending a sequence of new, adjustable vectors to the sequence of hidden state outputs from the layer below. In this setup, every vector within the combined input matrix for this layer is a trainable parameter.
Your team is building a multi-tenant LLM service w...
You’re reviewing an internal design doc for adapti...
You’re implementing a PEFT approach for a customer...
You’re reviewing a teammate’s claim about a new PE...
Diagnosing a PEFT Implementation Bug: Prompt Tuning vs Prefix Fine-Tuning
Choosing and Explaining a PEFT Strategy Under Deployment Constraints
Selecting Prompt Tuning vs Prefix Fine-Tuning by Reasoning from Where Soft Prompts Enter the Transformer
Post-Deployment PEFT Choice and Prefix Input Composition for a Multi-Tenant LLM Service
Choosing Between Prompt Tuning and Prefix Fine-Tuning for a Latency-Critical, Multi-Task LLM Service
Root-Causing a Prefix-Tuning Rollout Regression in a Multi-Task LLM Platform
Learn After
Output Selection Formula in a Prefix-Tuned Transformer Layer
Inter-Layer Data Flow in Prefix-Tuning
Consequences of Output Selection in a Modified Transformer
In a Transformer layer adapted for prefix-tuning, the input consists of a set of trainable prefix vectors followed by the hidden states for the original input sequence. After this combined input is processed by the layer, the resulting hidden states corresponding to the prefix vectors are discarded, and only the states for the original sequence are passed on. What is the most critical reason for this selective output process?
In a Transformer architecture modified for prefix-tuning, the hidden state representations corresponding to the trainable prefix vectors are passed along with the main input's hidden states to the subsequent layer to ensure the model has access to the learned task-specific information at every stage.