Learn Before
Multi-Layer Input Composition in Prefix-Tuning
Consider a 3-layer Transformer model where a tuning method involves prepending a unique set of trainable vectors (a 'prefix') to the input of each layer. The input to the first layer (Layer 1) is formed by concatenating its prefix, P1, with the initial sequence embeddings, H0. The output passed from Layer 1 to Layer 2 consists only of the hidden states corresponding to the original sequence, which we'll call H1. Based on this established pattern, describe the precise composition of the complete input that Layer 3 will receive.
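The data flow described above can be sketched in code. This is a minimal toy model, not a real Transformer: the layer computation is replaced by an arbitrary nonlinear map, and all names (`prefixes`, `H`, `prefix_len`) are illustrative assumptions. It shows how each layer l receives [P_l ; H_{l-1}] and passes on only the sequence portion, so Layer 3's input is its own prefix P3 concatenated with H2.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len, prefix_len, n_layers = 8, 5, 3, 3

# Hypothetical per-layer trainable prefixes P1..P3 and toy layer weights
# (a random linear map + tanh stands in for attention/FFN computation).
prefixes = [rng.normal(size=(prefix_len, d_model)) for _ in range(n_layers)]
weights = [rng.normal(size=(d_model, d_model)) for _ in range(n_layers)]

H = rng.normal(size=(seq_len, d_model))  # H0: initial sequence embeddings

for P, W in zip(prefixes, weights):
    layer_input = np.concatenate([P, H], axis=0)  # [P_l ; H_{l-1}]
    out = np.tanh(layer_input @ W)                # toy layer computation
    H = out[prefix_len:]                          # pass on sequence states only

# After the loop, H holds H3; at the start of iteration 3 the
# layer input was prefixes[2] (P3) stacked on H2.
```

Note that because only the sequence rows are forwarded, each layer's input always has exactly `prefix_len + seq_len` rows, regardless of depth.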
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In a multi-layer transformer model adapted for prefix-based tuning, the input to any given layer L is formed by prepending a set of layer-specific trainable vectors (the 'prefix') to the sequence representation from the previous layer. After all computations within layer L are finished, what is the precise composition of the input sequence for the next layer, L+1?

A single layer in a multi-layer model has been adapted for a tuning method where a set of trainable vectors (a 'prefix') is used. Arrange the following steps to accurately describe the complete data flow from the moment data enters this single layer until it is passed to the next.