Input Composition in a Prefix-Tuned Transformer Layer

In prefix fine-tuning, the input sequence for a given layer $l$, denoted $\mathbf{H}^l$, is constructed by prepending a sequence of trainable prefix vectors to the hidden states output by the previous layer. The composition is:

$$\mathbf{H}^l = \underbrace{\mathbf{p}_0^l\ \mathbf{p}_1^l\ \dots\ \mathbf{p}_n^l}_{\text{trainable}}\ \underbrace{\mathbf{h}_0^l\ \mathbf{h}_1^l\ \dots\ \mathbf{h}_m^l}_{\text{previous layer output}}$$

Here, $\mathbf{p}_0^l, \dots, \mathbf{p}_n^l$ are the trainable prefix vectors specific to layer $l$, and $\mathbf{h}_0^l, \dots, \mathbf{h}_m^l$ are the hidden states from the output of the preceding layer.
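A minimal NumPy sketch of this composition (the function name and tensor shapes are illustrative, not from a specific library): the prefix matrix holds the trainable vectors $\mathbf{p}_0^l, \dots, \mathbf{p}_n^l$ for one layer, and the hidden matrix holds the previous layer's outputs $\mathbf{h}_0^l, \dots, \mathbf{h}_m^l$.

```python
import numpy as np

def compose_prefixed_input(prefix: np.ndarray, hidden: np.ndarray) -> np.ndarray:
    """Prepend trainable prefix vectors to previous-layer hidden states.

    prefix: (num_prefix, d_model) -- trainable vectors p_0^l ... p_n^l
    hidden: (seq_len, d_model)    -- previous-layer outputs h_0^l ... h_m^l
    Returns H^l of shape (num_prefix + seq_len, d_model).
    """
    assert prefix.shape[1] == hidden.shape[1], "model dimensions must match"
    return np.concatenate([prefix, hidden], axis=0)

rng = np.random.default_rng(0)
prefix = rng.normal(size=(4, 16))   # 4 prefix vectors, model dim 16 (illustrative sizes)
hidden = rng.normal(size=(10, 16))  # 10 token hidden states from the previous layer
H = compose_prefixed_input(prefix, hidden)
print(H.shape)  # (14, 16)
```

Only the prefix rows receive gradient updates during training; the rest of the sequence flows through the frozen model unchanged.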

Updated 2026-05-02

Ch.3 Prompting - Foundations of Large Language Models
