Diagnosing a Sequence-to-Sequence Model Failure
Given the following scenario, identify the most likely missing architectural component and explain why its absence would lead to the described performance issues.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
An engineer is constructing a text summarization model by using a large, pre-trained language model as the encoder and a separate, newly initialized transformer as the decoder. The engineer observes that the model struggles to learn effectively. They hypothesize that the rich, complex output vectors from the pre-trained encoder are not in a format that the new decoder can easily interpret. Which of the following strategies directly addresses this specific problem by creating a bridge between the two components?
Analyzing the Utility of an Adapter Layer
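
The related question above describes a "bridge" between a pre-trained encoder and a newly initialized decoder. One common form such a bridge can take is a small learned projection, often called an adapter layer, that maps each encoder output vector into the representation the decoder expects. The following is a minimal sketch in plain Python, not a definitive implementation: the names make_adapter and apply_adapter, the sizes D_ENC and D_DEC, and the toy encoder outputs are all hypothetical, and the weights are left random here, whereas in practice they would be trained together with the decoder.

```python
import random

def make_adapter(d_enc, d_dec, seed=0):
    """Create weights and bias for a linear map from d_enc to d_dec dimensions.

    Hypothetical helper: the weights are random and untrained in this sketch.
    """
    rng = random.Random(seed)
    scale = 1.0 / d_enc ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(d_enc)]
               for _ in range(d_dec)]
    bias = [0.0] * d_dec
    return weights, bias

def apply_adapter(encoder_states, weights, bias):
    """Project every encoder output vector into the decoder's expected width."""
    return [
        [sum(w * x for w, x in zip(row, state)) + b
         for row, b in zip(weights, bias)]
        for state in encoder_states
    ]

# Assumed sizes for illustration only: encoder width 8, decoder width 4.
D_ENC, D_DEC = 8, 4

# Stand-in for the pre-trained encoder's outputs: 3 tokens, D_ENC values each.
encoder_states = [[0.1 * (i + j) for j in range(D_ENC)] for i in range(3)]

weights, bias = make_adapter(D_ENC, D_DEC)
decoder_memory = apply_adapter(encoder_states, weights, bias)

# One projected vector per input token, each of width D_DEC.
print(len(decoder_memory), len(decoder_memory[0]))
```

The design point is that the decoder never reads the raw encoder vectors: it attends over decoder_memory, so the adapter is the only place where the two representation spaces have to be reconciled.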