Prefix Tuning Architecture
Prefix tuning is a parameter-efficient fine-tuning method in which a sequence of trainable vectors, known as prefixes, is prepended to the hidden states at each layer of a Transformer model. For any given layer l, the input consists of the prefixes for that layer (e.g., p_1^l, ..., p_m^l) followed by the hidden states (h_1^{l-1}, ..., h_n^{l-1}) computed by the previous layer. The core LLM parameters remain frozen; only these layer-specific prefix vectors are optimized during training to steer the model's output for a downstream task.
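To make the per-layer mechanism concrete, here is a minimal PyTorch sketch (not from the source; the class and parameter names such as PrefixTunedLayer and prefix_len are illustrative assumptions). It freezes a simplified attention + feed-forward layer and trains only a layer-specific prefix, which is prepended to the key/value sequence so that queries from the real token positions attend over [prefix; hidden states]:

```python
import torch
import torch.nn as nn

class PrefixTunedLayer(nn.Module):
    """One simplified Transformer layer with a trainable, layer-specific prefix.

    The base layer's weights are frozen; only `self.prefix` is trained.
    """
    def __init__(self, d_model: int, n_heads: int, prefix_len: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.ReLU(),
            nn.Linear(4 * d_model, d_model),
        )
        # Freeze the "pre-trained" parameters of this layer.
        for p in self.parameters():
            p.requires_grad = False
        # Layer-specific prefix vectors: the ONLY trainable parameters.
        # (Defined after the freeze loop so it keeps requires_grad=True.)
        self.prefix = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, d_model) hidden states from the previous layer.
        batch = h.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        # Prepend the prefix so attention runs over [prefix; hidden states].
        kv = torch.cat([prefix, h], dim=1)
        out, _ = self.attn(query=h, key=kv, value=kv)
        h = h + out
        return h + self.ffn(h)

# Usage: stack layers, each with its own prefix; only prefixes are trainable.
layers = nn.ModuleList([PrefixTunedLayer(64, 4, prefix_len=8) for _ in range(2)])
x = torch.randn(3, 10, 64)  # (batch, seq_len, d_model) token embeddings
for layer in layers:
    x = layer(x)
print([n for n, p in layers.named_parameters() if p.requires_grad])
# -> ['0.prefix', '1.prefix']
```

Note that this is a sketch, not the full recipe: the original prefix-tuning method also reparameterizes the prefix matrix through a small MLP during training for optimization stability, a detail omitted here.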
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Input Composition Formula for Prompt Tuning
An engineer is adapting a large, pre-trained language model for a new task. To do this efficiently, they keep all the original model's parameters frozen. Their adaptation strategy involves modifying the input sequence before it is processed by the model. For any given text, they first convert the text into its standard sequence of numerical token representations. Then, they prepend a separate, short sequence of newly initialized, trainable numerical vectors to the beginning of that sequence. Only these new vectors are updated during training on the new task. Which statement best distinguishes the nature of these prepended, trainable vectors from the standard token representations?
You are examining the input layer of a large language model adapted using a parameter-efficient technique. The input is formed by combining two distinct types of numerical vectors. Match each vector type with its correct description.
Prefix Tuning Architecture
Parameter-Efficient Model Adaptation
Learn After
A researcher is implementing a parameter-efficient fine-tuning method for a large language model. The goal is to adapt the model to a new task by introducing a small number of new, trainable parameters while keeping the vast majority of the original model's weights frozen. Which of the following implementation strategies correctly identifies the unique architectural modification central to this specific method?
In the prefix tuning architecture, a sequence of trainable vectors is prepended exclusively to the initial input embeddings of the model. The hidden states of all subsequent layers are then computed based on this modified input, without any further addition of trainable vectors at those deeper layers.
Analyzing an Implementation of a Fine-Tuning Method