Architectural Suitability for Language Processing Tasks
Imagine you are designing a system for two different natural language tasks.
- Task A: Classifying the sentiment of a customer review (e.g., positive or negative). This requires a deep understanding of the entire review's context.
- Task B: Translating a sentence from English to French. This requires understanding the source sentence and then generating a new sentence in the target language.
Compare and contrast the suitability of an "encoder-only" architecture and an "encoder-decoder" architecture for these two tasks. In your analysis, explain how the flow of information through each architecture makes it either well suited or poorly suited to the specific demands of Task A and Task B.
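Not part of the original question, but as a concrete point of reference, the following is a minimal sketch of the two task setups using the Hugging Face `transformers` pipeline API. The specific checkpoints (`distilbert-base-uncased-finetuned-sst-2-english`, an encoder-only classifier, and `t5-small`, an encoder-decoder model) are illustrative assumptions, not models prescribed by the question.

```python
from transformers import pipeline

# Task A: an encoder-only model (a BERT-style classifier, assumed here) reads
# the whole review bidirectionally and maps it to a single sentiment label.
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(sentiment("The battery life is great, but the screen scratches far too easily."))
# Returns a list like [{'label': ..., 'score': ...}]; the exact label depends on the checkpoint.

# Task B: an encoder-decoder model (T5, assumed here) first encodes the English
# sentence, then its decoder generates the French sentence token by token,
# attending back to the encoder's representation at every step.
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("The cat sat on the mat."))
# Returns a list like [{'translation_text': ...}].
```

Note the structural difference the sketch illustrates: the classifier only needs an encoder followed by a classification head, whereas the translator needs a decoder that can autoregressively produce a new sequence conditioned on the encoded source.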
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Related
- Examples of Pre-trained Transformers by Architecture
- Architectural Suitability for Language Processing Tasks: A research team is developing a system to automatically summarize long scientific articles. The system needs to first read and understand the entire source article and then generate a concise new paragraph that captures the key findings. Which architectural category of pre-trained model is fundamentally best suited for this kind of sequence-to-sequence task?
- Match each pre-trained model architectural category to the description of its primary information-processing design.