Learn Before
A researcher is adapting a large pre-trained language model for a new task. Instead of modifying the model's original parameters, they introduce a small set of new, trainable vectors. These vectors are prepended to the sequence of hidden states at the input of every transformer layer. During training, only these new vectors are updated. Which statement best analyzes the primary impact of this technique on the model's computation?
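The technique described in the question is prefix tuning: trainable vectors are prepended to the hidden states entering every layer, while the pre-trained weights stay frozen. A minimal NumPy sketch of the data flow, using made-up sizes and a fixed linear map plus tanh as a stand-in for each frozen transformer layer (all names and dimensions here are illustrative assumptions, not part of the question):

```python
import numpy as np

# Hypothetical sizes for illustration only.
n_layers, seq_len, prefix_len, d_model = 4, 10, 5, 16

rng = np.random.default_rng(0)

# Frozen stand-ins for the pre-trained layers: fixed weight matrices,
# never updated during adaptation.
frozen_weights = [rng.normal(size=(d_model, d_model)) for _ in range(n_layers)]

# The only trainable parameters: one prefix of shape (prefix_len, d_model)
# per layer.
prefixes = [np.zeros((prefix_len, d_model)) for _ in range(n_layers)]

def forward(x):
    """Prepend each layer's trainable prefix to that layer's input hidden states."""
    h = x
    for weight, prefix in zip(frozen_weights, prefixes):
        h = np.concatenate([prefix, h], axis=0)  # (prefix_len + seq_len, d_model)
        h = np.tanh(h @ weight)                  # frozen layer computation
        h = h[prefix_len:]                       # keep only the token positions;
                                                 # the next layer gets its own prefix
    return h

out = forward(rng.normal(size=(seq_len, d_model)))
print(out.shape)  # (10, 16): token positions are preserved end to end
```

Because the prefix positions sit in the sequence at every layer, the frozen computation attends to (here, mixes with) them at every depth, which is the "primary impact on the model's computation" the question asks about.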
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Improving a Parameter-Efficient Fine-Tuning Strategy
An engineer is adapting a large language model for a specialized task by introducing a set of trainable vectors. These vectors are prepended to the sequence of hidden states at the input of every layer in the model. During the adaptation process, the original model parameters remain unchanged, and only these new vectors are optimized. What is the most significant advantage of this specific approach compared to a method that only adds trainable vectors at the initial input layer?
Illustration of Prefix Fine-Tuning
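The second related question contrasts per-layer prefixes with a method that only adds trainable vectors at the input layer (prompt tuning). A rough trainable-parameter comparison, using purely illustrative sizes (none of these numbers come from the cards above):

```python
# Hypothetical configuration for illustration only.
n_layers, prefix_len, d_model = 24, 20, 1024

# Per-layer prefixes: one trainable prefix at the input of every layer.
per_layer_params = n_layers * prefix_len * d_model

# Input-only prefixes: a single trainable prefix at the first layer.
input_only_params = prefix_len * d_model

print(per_layer_params)   # 491520
print(input_only_params)  # 20480
```

The per-layer variant trains more vectors, but each one directly steers the computation at its own depth rather than relying on the frozen layers to propagate the influence of input-layer vectors upward.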