1Cademy - A researcher is comparing two different methods for adapting a pre-trained transformer model, keeping the original model weights frozen. Method A prepends a sequence of trainable vectors to the input sequence *before* it enters the first layer. Method B prepends a sequence of trainable vectors to the sequence of hidden states *at each layer* of the model. Which statement best analyzes the architectural difference in how these methods influence the model's processing?

Learn Before

Architecture of Prefix Tuning

Multiple Choice

A researcher is comparing two different methods for adapting a pre-trained transformer model, keeping the original model weights frozen. Method A prepends a sequence of trainable vectors to the input sequence before it enters the first layer. Method B prepends a sequence of trainable vectors to the sequence of hidden states at each layer of the model. Which statement best analyzes the architectural difference in how these methods influence the model's processing?
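
The two setups in the question can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not an implementation of any particular library: `frozen_layer` is a hypothetical toy stand-in for a frozen transformer layer (it mixes each position with the mean of all positions, a crude substitute for self-attention), and the names `method_a`, `method_b`, and `count_params` are invented here for illustration.

```python
from typing import List

Vector = List[float]
Sequence = List[Vector]


def frozen_layer(hidden: Sequence, scale: float) -> Sequence:
    """Toy stand-in for one frozen layer (an assumption, not a real transformer).

    Each position is mixed with the mean of all positions, so prepended
    vectors influence every token. `scale` plays the role of frozen weights.
    """
    dim = len(hidden[0])
    mean = [sum(vec[i] for vec in hidden) / len(hidden) for i in range(dim)]
    return [[scale * (vec[i] + mean[i]) for i in range(dim)] for vec in hidden]


def method_a(tokens: Sequence, prefix: Sequence, scales: List[float]) -> Sequence:
    """Method A: prepend trainable vectors once, before the first layer."""
    hidden = prefix + tokens
    for scale in scales:
        # Later layers only see the prefix as transformed by earlier frozen layers.
        hidden = frozen_layer(hidden, scale)
    return hidden[len(prefix):]  # keep only the token positions


def method_b(tokens: Sequence, prefixes: List[Sequence], scales: List[float]) -> Sequence:
    """Method B: prepend a separate set of trainable vectors at every layer."""
    hidden = tokens
    for prefix, scale in zip(prefixes, scales):
        # Each layer receives its own fresh trainable vectors.
        out = frozen_layer(prefix + hidden, scale)
        hidden = out[len(prefix):]  # drop the prefix positions before the next layer
    return hidden


def count_params(vectors: Sequence) -> int:
    """Number of trainable scalars in a list of vectors."""
    return sum(len(vec) for vec in vectors)


tokens = [[1.0, 0.0], [0.0, 1.0]]
scales = [1.0, 0.5]                    # two frozen layers
prefix_a = [[0.5, 0.5]]                # one trainable vector at the input
prefixes_b = [[[0.5, 0.5]], [[0.1, 0.2]]]  # one trainable vector per layer

out_a = method_a(tokens, prefix_a, scales)
out_b = method_b(tokens, prefixes_b, scales)
params_a = count_params(prefix_a)
params_b = sum(count_params(p) for p in prefixes_b)
```

In this sketch, Method A has trainable parameters only at the input, so deeper layers are influenced indirectly through the frozen layers, while Method B gives every layer its own trainable vectors and its parameter count grows with the number of layers.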

Updated 2025-10-03

