Constructing the Input Hidden State for a Prefix-Tuned Layer
In a transformer model adapted with prefix vectors, consider the input to layer l+1. The prefix for this layer consists of 10 trainable vectors. The hidden states corresponding to the original text input, produced by the previous layer l, form a sequence of 512 vectors. Each vector in the model has a dimension of 768. Describe the structure of the complete hidden state sequence that is fed into the self-attention mechanism of layer l+1, and state its final dimensions.
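Answer outline: the layer input is the 10 prefix vectors concatenated, along the sequence axis, in front of the 512 text hidden states, giving a sequence of 10 + 512 = 522 vectors of dimension 768, i.e. a 522 x 768 matrix. A minimal PyTorch sketch of this construction (tensor names are illustrative, and the batch dimension is omitted):

```python
import torch

# Dimensions from the question.
prefix_len, text_len, d_model = 10, 512, 768

# Trainable prefix vectors for layer l+1 (one row per virtual token).
prefix = torch.nn.Parameter(torch.randn(prefix_len, d_model))

# Hidden states for the 512 text positions, output by layer l.
text_hidden = torch.randn(text_len, d_model)

# Prepend the prefix along the sequence axis, so self-attention
# in layer l+1 sees one combined sequence.
layer_input = torch.cat([prefix, text_hidden], dim=0)

print(layer_input.shape)  # torch.Size([522, 768]) -> (10 + 512) x 768
```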
Tags
Ch.3 Prompting - Foundations of Large Language Models
Application in Bloom's Taxonomy
Related
In a specific parameter-efficient tuning method, each layer of a transformer is adapted by prepending a sequence of new, trainable vectors to the sequence of hidden states from the previous layer. Suppose for a given layer, the sequence of these new trainable vectors has a length of 20, and the sequence of hidden states corresponding to the original text input has a length of 128. After this layer processes the combined sequence, a new set of hidden states is generated. How is the complete hidden state sequence for the next layer constructed?
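Answer outline, following the construction stated in the question: the text-position outputs of the current layer (length 128) are kept, the outputs at the 20 prefix positions are not reused, and the next layer's own 20 new trainable vectors are prepended, giving a combined sequence of 20 + 128 = 148 vectors. A sketch under the same assumptions as above (the hidden width is not given here; 768 is assumed for illustration):

```python
import torch

prefix_len, text_len, d_model = 20, 128, 768  # d_model is an assumed width

# New trainable prefix vectors belonging to the *next* layer.
next_prefix = torch.nn.Parameter(torch.randn(prefix_len, d_model))

# Output of the current layer over the combined (20 + 128)-long sequence.
layer_output = torch.randn(prefix_len + text_len, d_model)

# Keep only the text-position outputs; the prefix positions are replaced
# by the next layer's own trainable vectors.
text_hidden = layer_output[prefix_len:]                 # shape (128, 768)
next_input = torch.cat([next_prefix, text_hidden], dim=0)

print(next_input.shape)  # torch.Size([148, 768]) -> (20 + 128) x 768
```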
Analyzing an Incorrect Hidden State Composition