Consider a prefix-tuned Transformer layer where the full input H^l is composed of prefix vectors followed by the original input's hidden states. The layer processes the entire combined sequence, so the prefix vectors do participate in the computation (e.g., as extra positions the attention can attend to). The output passed to the subsequent layer, \overline{H}^{l+1}, is then obtained by keeping only the output hidden states at the positions of the original input and discarding the outputs at the prefix positions.
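A minimal NumPy sketch of this slicing, assuming illustrative sizes (20 prefix vectors, 128 input hidden states) and a hypothetical stand-in for the real Transformer layer:

```python
import numpy as np

# Illustrative sizes: 20 trainable prefix vectors + 128 input hidden states.
n_prefix, n_input, d = 20, 128, 64

def transformer_layer(h):
    # Hypothetical stand-in for a real Transformer layer: any
    # sequence-to-sequence map of shape (seq, d) -> (seq, d).
    return np.tanh(h)

prefix = np.random.randn(n_prefix, d)           # trainable prefix vectors
hidden = np.random.randn(n_input, d)            # hidden states from layer l

H_l = np.concatenate([prefix, hidden], axis=0)  # full input: 148 positions
out = transformer_layer(H_l)                    # layer processes all 148
H_next = out[n_prefix:]                         # keep only the last 128

print(H_next.shape)  # (128, 64)
```

The layer is applied to all 148 positions, but only the slice `out[n_prefix:]` corresponding to the original input is forwarded to the next layer.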
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A Transformer layer adapted for a specific fine-tuning method receives a combined input sequence. This input is created by prepending 20 trainable vectors to a sequence of 128 hidden states from the previous layer. After processing this combined sequence of 148 vectors, the layer produces a full set of 148 output hidden states. Which portion of this full output is selected to be passed on to the next layer in the network?
Calculating the Output Slice in Prefix-Tuning
Composition of Hidden States in a Prefix-Tuned Layer
Consider a prefix-tuned Transformer layer where the full input H^l is composed of prefix vectors followed by the original input's hidden states. The layer processes the entire combined sequence, prefix vectors included; the output passed to the subsequent layer, \overline{H}^{l+1}, is obtained by keeping only the output hidden states at the positions of the original input and discarding the outputs at the prefix positions.