Learn Before
Relation

Model Usage of Transformers

  • Encoder-Decoder: the full architecture maps an input sequence to an output sequence; used for sequence-to-sequence tasks such as machine translation
  • Encoder Only: the encoder's outputs are used as a representation of the input sequence; typically used for classification or sequence labeling problems (e.g., BERT)
  • Decoder Only: the cross-attention module is removed; typically used for sequence generation, such as language modeling (e.g., GPT)
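A key mechanical difference between the encoder-only and decoder-only usages is the attention mask: encoder-only models attend bidirectionally over the whole input, while decoder-only models use a causal mask so each token attends only to earlier positions. A minimal sketch (function names are illustrative, not from any library):

```python
import numpy as np

def encoder_attention_mask(n: int) -> np.ndarray:
    # Encoder-only (BERT-style): full bidirectional attention —
    # every position may attend to every other position.
    return np.ones((n, n), dtype=bool)

def decoder_attention_mask(n: int) -> np.ndarray:
    # Decoder-only (GPT-style): causal mask — position i may attend
    # only to positions 0..i, enabling autoregressive generation.
    return np.tril(np.ones((n, n), dtype=bool))

print(encoder_attention_mask(3).astype(int))
print(decoder_attention_mask(3).astype(int))
```

In an encoder-decoder model both patterns appear: the encoder uses the full mask over the source sequence, while the decoder combines a causal self-attention mask with cross-attention over the encoder's outputs.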


Updated 2025-10-10

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Related