Learn Before
Match each architectural component with its primary role in a large language model.
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analysis of Expert Networks in Language Model Architecture
A standard transformer-based language model layer consists of a self-attention mechanism followed by a feed-forward network (FFN). An alternative architecture aims for greater parameter capacity and computational efficiency by using a routing mechanism that selectively activates one of several specialized 'expert' sub-networks within each layer for a given input. Based on this design, which component of the standard transformer layer do these 'expert' sub-networks most directly implement and parallelize?