Learn Before
Within a single layer of a Transformer model during inference, a sequence of input vectors passes through two sub-layers in order. Which statement best analyzes the distinct roles of the self-attention mechanism and the subsequent Feed-Forward Network (FFN) in this process?
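As a concrete reference point for the two sub-layers the question contrasts, here is a minimal PyTorch sketch of one Transformer layer: self-attention mixes information across positions, while the FFN transforms each position independently. The pre-norm arrangement, the dimensions (d_model=512, n_heads=8, d_ff=2048), and the class name TransformerLayer are illustrative assumptions, not taken from the course.

```python
import torch
import torch.nn as nn

class TransformerLayer(nn.Module):
    """One pre-norm Transformer layer: self-attention, then a position-wise FFN.
    Layout is an assumption for illustration; courses/models vary (pre- vs post-norm)."""

    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        # Step 1: self-attention lets each position read from all other positions.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        # Step 2: the FFN is applied to each position separately and identically.
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Sub-layer 1: self-attention with a residual connection.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Sub-layer 2: position-wise FFN with a residual connection.
        x = x + self.ffn(self.norm2(x))
        return x

# Usage: a batch of 2 sequences, 10 tokens each, model width 512.
layer = TransformerLayer()
out = layer(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 512])
```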
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Stacked Layer Architecture and Final Output in Transformers
Formula for Single-Head Self-Attention
Arrange the following computational steps in the correct order as they occur within a single layer of a Transformer model during inference.
Debugging a Transformer Layer