Learn Before
Model Usage of Transformers
- Encoder-Decoder: maps an input sequence to an output sequence; used for sequence-to-sequence tasks such as machine translation
- Encoder Only: the encoder's outputs serve as a representation of the input sequence; typically used for classification or sequence labeling problems (e.g., BERT)
- Decoder Only: the cross-attention module is removed; typically used for sequence generation, such as language modeling (e.g., GPT); see the code sketch after this list
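To make the three usage patterns concrete, here is a minimal sketch using the Hugging Face transformers library (see the Huggingface Model Summary card under Related). The checkpoint names t5-small, bert-base-uncased, and gpt2 are illustrative assumptions, not part of this card; any checkpoints of the matching architecture would do.

```python
# Minimal sketch of the three Transformer usage patterns.
# Checkpoint names are illustrative assumptions.
from transformers import (
    AutoModel,
    AutoModelForCausalLM,
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
)

# Encoder-Decoder: map an input sequence to an output sequence
# (sequence-to-sequence, e.g., translation or summarization).
tok = AutoTokenizer.from_pretrained("t5-small")
seq2seq = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
inputs = tok("translate English to German: Hello, world!", return_tensors="pt")
out_ids = seq2seq.generate(**inputs, max_new_tokens=20)
print(tok.decode(out_ids[0], skip_special_tokens=True))

# Encoder Only: use the encoder's hidden states as a representation
# of the input, e.g., feed the [CLS] vector to a classifier head.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
inputs = tok("A review to classify.", return_tensors="pt")
cls_vec = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] representation

# Decoder Only: autoregressive generation with no cross-attention;
# the prompt and its continuation live in the same sequence.
tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tok("Once upon a time", return_tensors="pt")
out_ids = decoder.generate(**inputs, max_new_tokens=20)
print(tok.decode(out_ids[0], skip_special_tokens=True))
```

Note that only the encoder-decoder model needs separate source and target sequences; the decoder-only model conditions on the prompt purely through self-attention over the growing sequence.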
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Related
Self-attention layers' first approach
Transformers in contextual generation and summarization
Huggingface Model Summary
A Survey of Transformers (Lin et al., 2021)
Overview of a Transformer
Model Usage of Transformers
Attention in vanilla Transformers
Transformer Variants (X-formers)
The Pre-training and Fine-tuning Paradigm
Architectural Categories of Pre-trained Transformers
Computational Cost of Self-Attention in Transformers
Quadratic Complexity's Impact on Transformer Inference Speed
Pre-Norm Architecture in Transformers
Critique of the Transformer Architecture's Core Limitation
A research team is building a model to summarize extremely long scientific papers. They are comparing two distinct architectural approaches:
- Approach 1: Processes the input text sequentially, token by token, updating an internal state that is passed from one step to the next.
- Approach 2: Processes all input tokens simultaneously, using a mechanism that directly relates every token to every other token in the input to determine context.
Which of the following statements best analyzes the primary trade-off between these two approaches for this specific task?
Architectural Design Choice for Machine Translation
Enablers of Universal Language Capabilities
Model Depth in Transformers
Generalization of the Language Modeling Concept
Transformer Block Sub-Layers
Standard Optimization Objective for Transformer Language Models
Learn After
Decoder-Only Transformer as a Language Model
An engineering team is tasked with building a system to perform sentiment analysis on customer reviews. The goal is to classify each review as 'positive', 'negative', or 'neutral'. For accurate classification, the model must understand the full context of the entire review, including how words at the end of a sentence can influence the meaning of words at the beginning. Which of the following architectural approaches is best suited for this specific task?
You are a machine learning engineer evaluating different model architectures for three distinct natural language processing projects. Match each project description with the most suitable architectural approach based on its core requirements.
Architectural Design for a Creative Writing Assistant
Architectural Choice for Document Summarization