Short Answer

Information Flow in a Multi-Layer Tuning Process

Consider a diagram illustrating a parameter-efficient tuning method for a large language model. For an arbitrary layer l, the diagram shows a sequence of new, trainable vectors being introduced. These vectors are combined with the sequence of hidden states passed up from the previous layer (l-1), and the combined sequence then serves as the input to the main computational block of layer l. Based on this process, explain two key aspects:

  1. What is the specific operation used to combine the new trainable vectors with the hidden states from the previous layer?
  2. During the training process, which set of parameters is modified to minimize the task-specific error: the new trainable vectors, the original weights of the main computational block of layer l, or both?
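The information flow described above can be sketched in a few lines of NumPy. This is an illustrative sketch, assuming a prefix-tuning-style setup; the sizes, variable names, and random initialization are all chosen for the example, not taken from any particular implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8    # hidden size (illustrative)
seq_len = 5    # number of tokens coming from layer l-1
n_prefix = 3   # number of new trainable vectors

# Hidden states passed up from layer l-1: one d_model-dimensional
# vector per token in the sequence.
H_prev = rng.normal(size=(seq_len, d_model))

# New trainable vectors introduced at layer l (hypothetical init).
P = rng.normal(size=(n_prefix, d_model))

# The combining operation: concatenation along the sequence axis,
# so the trainable vectors are prepended to the token sequence.
combined = np.concatenate([P, H_prev], axis=0)

# Frozen weights of layer l's main computational block; in this
# sketch the block is reduced to a single linear map.
W_frozen = rng.normal(size=(d_model, d_model))

# Layer l consumes the lengthened sequence.
H_l = combined @ W_frozen

# During training, gradient updates would flow only to the new
# vectors; the original block weights stay fixed.
trainable_params = {"prefix_vectors": P}
frozen_params = {"layer_weights": W_frozen}
```

Note how the concatenation lengthens the sequence seen by layer l from `seq_len` to `n_prefix + seq_len`, while the dimensionality of each vector (`d_model`) is unchanged.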

Updated 2025-10-04

Tags

Ch.3 Prompting - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science