Concept

Output Selection in a Prefix-Tuned Transformer Layer

In a Transformer layer adapted for prefix fine-tuning, the output passed to the next layer consists only of the final m+1 hidden state representations, i.e., those corresponding to the original input positions. While the layer's input is the concatenation of trainable prefix vectors and the previous layer's output, the representations computed at the prefix positions are discarded after the layer runs. The selected output, containing only the hidden states for the original input sequence, then serves as the previous-layer input for the next layer, so every layer sees the same structure: its own fresh prefix vectors plus m+1 hidden states.
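This selection step can be sketched in a few lines. The code below is a minimal illustration, not a real Transformer layer: `layer` is a stand-in transformation, and the names `prefix_tuned_layer`, `h_prev`, and `prefix` are hypothetical. What it shows is the mechanics described above: concatenate the p prefix vectors with the m+1 incoming hidden states, run the layer over all p+m+1 positions, then keep only the last m+1 outputs.

```python
import numpy as np

def layer(x, W):
    # Stand-in for a full Transformer layer: a simple linear map plus
    # nonlinearity. A real layer would apply self-attention and a
    # feed-forward block, but the slicing logic is the same.
    return np.tanh(x @ W)

def prefix_tuned_layer(h_prev, prefix, W):
    """h_prev: (m+1, d) hidden states from the previous layer.
    prefix: (p, d) trainable prefix vectors for this layer.
    W: (d, d) stand-in layer weights."""
    x = np.concatenate([prefix, h_prev], axis=0)  # (p + m + 1, d)
    out = layer(x, W)                             # computed at every position
    # Discard the p prefix-position outputs; pass on only the last m+1.
    return out[prefix.shape[0]:]

rng = np.random.default_rng(0)
m, d, p = 4, 8, 3
h = rng.normal(size=(m + 1, d))      # hidden states for m+1 input tokens
P = rng.normal(size=(p, d))          # p prefix vectors
W = rng.normal(size=(d, d))
out = prefix_tuned_layer(h, P, W)
print(out.shape)  # (5, 8): same shape as the incoming hidden states
```

Because the output has the same shape as `h`, it can be fed directly to the next layer, which prepends its own prefix vectors and repeats the process.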

Updated 2026-01-15

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences