Selecting Pre-trained Model Architectures for Specific Tasks
For each of the two systems described in the case study, determine which type of pre-trained model architecture would be most effective: an encoder-only, a decoder-only, or an encoder-decoder structure. Justify your choices by explaining how the architecture's design aligns with the specific requirements of each task.
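For orientation only (not part of the expected answer): below is a minimal sketch of how the three architecture families differ in practice, assuming the Hugging Face transformers library and the representative public checkpoints bert-base-uncased, gpt2, and t5-small. None of these library or checkpoint names come from the case study itself; they simply stand in for the encoder-only, decoder-only, and encoder-decoder families.

```python
# Illustrative sketch only: representative checkpoints for each architecture
# family, loaded with the Hugging Face transformers library (an assumption;
# the case study does not name any specific library or model).
from transformers import (
    AutoModel,              # encoder-only backbones (e.g., BERT)
    AutoModelForCausalLM,   # decoder-only language models (e.g., GPT-2)
    AutoModelForSeq2SeqLM,  # encoder-decoder models (e.g., T5)
    AutoTokenizer,
)

# Encoder-only: bidirectional context over the input, suited to understanding
# tasks such as classification or span extraction.
encoder_only = AutoModel.from_pretrained("bert-base-uncased")

# Decoder-only: left-to-right generation, suited to open-ended continuation
# and free-form text generation.
decoder_only = AutoModelForCausalLM.from_pretrained("gpt2")

# Encoder-decoder: reads the full source sequence, then generates a new target
# sequence, suited to sequence-to-sequence tasks such as summarization.
encoder_decoder = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

tokenizer = AutoTokenizer.from_pretrained("t5-small")
inputs = tokenizer(
    "summarize: The quick brown fox jumps over the lazy dog.",
    return_tensors="pt",
)
summary_ids = encoder_decoder.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

The key contrast the sketch is meant to surface: the first two models map one token stream to representations or continuations of that same stream, while the encoder-decoder model is built around a separate source sequence (encoded once) and target sequence (generated token by token).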
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Selecting Pre-trained Model Architectures for Specific Tasks
Self-supervised pre-training can be applied to different underlying model structures to create systems optimized for specific kinds of tasks. Match each model architecture with the description of the task it is most fundamentally suited for.
A team is building a foundation model intended primarily for abstractive summarization tasks, which require processing a source document and generating a new, coherent summary. They choose a full encoder-decoder architecture for self-supervised pre-training. What is the most critical reason this architecture is better suited for this task than an encoder-only or decoder-only model?
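As a concrete illustration of the structural difference behind that last question, here is a minimal sketch (again assuming the Hugging Face transformers library and the t5-small checkpoint, neither of which is named in the question) showing that each decoder block of an encoder-decoder model carries a cross-attention layer over the encoded source document, a component that encoder-only and decoder-only stacks do not have.

```python
# Minimal sketch (assumed library: transformers; assumed checkpoint: t5-small)
# inspecting the sublayers of one decoder block in an encoder-decoder model.
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Each T5 decoder block contains: self-attention over the summary generated so
# far, cross-attention over the encoder's representation of the source
# document, and a feed-forward layer.
decoder_block = model.decoder.block[0]
print([type(layer).__name__ for layer in decoder_block.layer])
# Expected: ['T5LayerSelfAttention', 'T5LayerCrossAttention', 'T5LayerFF']
```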