Multiple Choice

An engineer is adapting a pre-trained language model for a new task. They want to add a small number of trainable vectors to guide the model's behavior without changing any of the original model weights. What is the fundamental architectural difference between a strategy that adds these vectors only to the input embedding layer versus one that adds them to the input of every transformer layer?
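The two strategies in the question can be compared in a small sketch. This is an illustrative toy under stated assumptions, not a real transformer: `frozen_layer` is a hypothetical stand-in that only mixes positions by adding the sequence mean, and the names `run_input_only` and `run_every_layer` are invented for this example. It shows where the trainable vectors enter the computation and how many trainable values each strategy introduces.

```python
from typing import List

Vector = List[float]


def frozen_layer(hidden: List[Vector]) -> List[Vector]:
    """Toy stand-in for a frozen transformer layer (assumption, not real attention).

    Every position adds the mean of all positions, so prepended vectors
    influence the token positions without any weight being changed.
    """
    n, d = len(hidden), len(hidden[0])
    mean = [sum(v[j] for v in hidden) / n for j in range(d)]
    return [[v[j] + mean[j] for j in range(d)] for v in hidden]


def run_input_only(prompt: List[Vector], tokens: List[Vector], num_layers: int) -> List[Vector]:
    """Trainable vectors are prepended once, at the input embedding layer.

    In deeper layers the prompt positions hold whatever the frozen layers
    computed from them; they are not directly trainable there.
    """
    hidden = prompt + tokens
    for _ in range(num_layers):
        hidden = frozen_layer(hidden)
    return hidden[len(prompt):]  # keep only the real token positions


def run_every_layer(layer_prompts: List[List[Vector]], tokens: List[Vector]) -> List[Vector]:
    """Each layer receives its own trainable vectors at its input.

    The prefix positions are replaced with fresh trainable vectors before
    every layer, so each depth can be steered directly.
    """
    hidden = tokens
    for prompt in layer_prompts:
        hidden = frozen_layer(prompt + hidden)[len(prompt):]
    return hidden


# Hypothetical sizes chosen only for illustration.
d_model, prompt_len, num_layers = 4, 2, 3
tokens = [[1.0] * d_model for _ in range(3)]

shallow_prompt = [[0.1] * d_model for _ in range(prompt_len)]
deep_prompts = [[[0.1] * d_model for _ in range(prompt_len)] for _ in range(num_layers)]

shallow_out = run_input_only(shallow_prompt, tokens, num_layers)
deep_out = run_every_layer(deep_prompts, tokens)

# Trainable values: one set of vectors versus one set per layer.
shallow_params = prompt_len * d_model
deep_params = num_layers * prompt_len * d_model
print(shallow_params, deep_params)
```

In this sketch the input-only strategy trains `prompt_len * d_model` values and relies on the frozen layers to carry their effect upward, while the every-layer strategy trains `num_layers * prompt_len * d_model` values and injects them directly at each depth.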

Updated 2025-10-05

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science