Analyzing the Utility of an Adapter Layer
Consider two sequence-to-sequence models. Model A uses a powerful, pre-trained language model as its encoder and a randomly initialized model as its decoder. Model B trains both its encoder and decoder together from scratch on the same task. Explain why an intermediate 'adapter' layer between the encoder and decoder would likely improve Model A's performance more than Model B's.
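To make the term concrete, here is a minimal sketch of one possible adapter: a single linear projection that maps each encoder output vector into the vector size the decoder reads. This is an illustration under stated assumptions, not the design the question has in mind. The names make_adapter and apply_adapter, the dimensions, and the toy encoder states are all hypothetical, and the weights are random rather than trained.

```python
import random


def make_adapter(enc_dim, dec_dim, seed=0):
    """Build (weights, bias) for a linear map from enc_dim to dec_dim.

    Hypothetical helper: in a real model these would be trainable
    parameters updated together with the decoder.
    """
    rng = random.Random(seed)
    scale = 1.0 / enc_dim ** 0.5  # small init, a common but assumed choice
    weights = [
        [rng.uniform(-scale, scale) for _ in range(enc_dim)]
        for _ in range(dec_dim)
    ]
    bias = [0.0] * dec_dim
    return weights, bias


def apply_adapter(adapter, encoder_states):
    """Project every encoder output vector into the decoder's input space."""
    weights, bias = adapter
    return [
        [sum(w * x for w, x in zip(row, state)) + b
         for row, b in zip(weights, bias)]
        for state in encoder_states
    ]


# Assumed toy sizes: encoder emits 8-dim vectors, decoder expects 4-dim ones.
ENC_DIM, DEC_DIM, SEQ_LEN = 8, 4, 3
encoder_states = [
    [0.1 * (i + j) for j in range(ENC_DIM)] for i in range(SEQ_LEN)
]

adapter = make_adapter(ENC_DIM, DEC_DIM)
decoder_inputs = apply_adapter(adapter, encoder_states)
print(len(decoder_inputs), len(decoder_inputs[0]))  # prints: 3 4
```

The sequence length is unchanged; only the per-token representation is re-expressed. In a setup like Model A, a map of this kind would be the component that has to learn the translation between a representation fixed by pre-training and a decoder that starts from random weights.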
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An engineer is constructing a text summarization model by using a large, pre-trained language model as the encoder and a separate, newly initialized transformer as the decoder. The engineer observes that the model struggles to learn effectively. They hypothesize that the rich, complex output vectors from the pre-trained encoder are not in a format that the new decoder can easily interpret. Which of the following strategies directly addresses this specific problem by creating a bridge between the two components?
Diagnosing a Sequence-to-Sequence Model Failure