Formula

Output Selection Formula in a Prefix-Tuned Transformer Layer

The selection of the last $m+1$ hidden states in a prefix-tuned Transformer layer is expressed mathematically. The output for the next layer, $\overline{\mathbf{H}}^{l+1}$, is derived by applying the layer's transformation to the full input $\mathbf{H}^{l}$ and then slicing the resulting sequence to retain only the final $m+1$ vectors. The formula is:

$$\overline{\mathbf{H}}^{l+1} = \mathrm{Layer}(\mathbf{H}^{l})[-m-1:] = \mathbf{h}_0^{l+1}\,\mathbf{h}_1^{l+1} \cdots \mathbf{h}_m^{l+1}$$

where $[-m-1:]$ denotes the slicing operation, which keeps the last $m+1$ elements of the sequence.
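The slicing step can be illustrated with a minimal NumPy sketch. It assumes the full input stacks some prefix vectors in front of the $m+1$ hidden states, with the sequence along axis 0. The names `toy_layer`, `select_output`, and `n_prefix`, as well as all sizes, are hypothetical, and `toy_layer` is only a length-preserving placeholder, not a real Transformer layer.

```python
import numpy as np

def toy_layer(h):
    """Hypothetical stand-in for Layer(.): any map that keeps the sequence length."""
    return np.tanh(h)  # placeholder transformation, not a real Transformer layer

def select_output(layer_out, m):
    """Keep the last m+1 vectors along the sequence axis (axis 0)."""
    return layer_out[-m - 1:]

# Hypothetical sizes: number of prefix vectors, m, and hidden dimension.
n_prefix, m, d = 3, 4, 8

rng = np.random.default_rng(0)
H_l = rng.standard_normal((n_prefix + m + 1, d))  # full input: prefix + m+1 states

H_bar_next = select_output(toy_layer(H_l), m)
print(H_bar_next.shape)  # (5, 8), i.e. (m + 1, d)
```

With these shapes, the negative slice `[-m - 1:]` returns the same rows as `[n_prefix:]`, so the rows at the assumed prefix positions are dropped without the slice needing to know the prefix length.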

Updated 2026-05-02

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences