Learn Before
Comparison of Pre-training Tasks
Pre-training tasks can be compared by the input-output transformation they define and by the model architectures they suit. Language modeling variants (such as Causal and Prefix LM) focus on sequential text generation and are typically applied to decoder-only and encoder-decoder models. Masked language modeling approaches (e.g., MASS-style, BERT-style) reconstruct masked tokens and are compatible with encoder-only and encoder-decoder architectures. Permuted language modeling and discriminative training methods (such as Next Sentence Prediction, Sentence Comparison, and Token Classification) are tailored to encoder-only models. Finally, denoising autoencoding covers corruption schemes such as token reordering, token deletion, span masking, sentinel masking, sentence reordering, and document rotation; these tasks train encoder-decoder models to reconstruct the original text from the corrupted input.
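To make the input-output contrast concrete, below is a minimal Python sketch (not part of the course materials) that builds toy training pairs for three of the objectives above: causal language modeling, BERT-style masked language modeling, and sentinel masking. It works on whitespace-split tokens rather than a real tokenizer, and the helper names (causal_lm_pairs, bert_style_masking, sentinel_span_masking), the special-token strings, and the 15% mask rate are illustrative assumptions, not any library's API.

import random

# Toy special tokens (hypothetical; real tokenizers define their own).
MASK, SENTINEL = "[MASK]", "<extra_id_0>"

def causal_lm_pairs(tokens):
    """Causal LM: predict each token from the tokens to its left."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def bert_style_masking(tokens, mask_rate=0.15, seed=0):
    """BERT-style MLM: hide roughly mask_rate of tokens; the targets are the originals."""
    rng = random.Random(seed)
    corrupted, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            corrupted[i] = MASK
            targets[i] = tok
    return corrupted, targets

def sentinel_span_masking(tokens, start, length):
    """Sentinel masking: replace a contiguous span with one sentinel token;
    an encoder-decoder model learns to regenerate the span from the sentinel."""
    corrupted = tokens[:start] + [SENTINEL] + tokens[start + length:]
    target = [SENTINEL] + tokens[start:start + length]
    return corrupted, target

if __name__ == "__main__":
    toks = "the model learns language structure from raw text".split()
    print(causal_lm_pairs(toks)[:2])          # left-to-right prediction pairs
    print(bert_style_masking(toks))           # masked input plus recovery targets
    print(sentinel_span_masking(toks, 2, 3))  # corrupted input plus span target

Note how the three outputs map onto the architectures described above: the left-to-right pairs suit a decoder-only model, the masked-token targets suit an encoder-only model, and the sentinel example gives an encoder-decoder model a corrupted input and a target sequence to generate.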
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Contrastive Learning (CTL)
Extensions of PTMs
Applying and Adapting Pre-trained Models to Downstream Tasks
Unsupervised Pre-training
Supervised Pre-training
Self-Supervised Learning
Comparison of Pre-training Paradigms
Rationale for Categorizing Pre-training Tasks by Objective
Denoising Autoencoding
Comparability of Pre-training Tasks
Generality of Pre-training Tasks and Performance
Applying Pre-trained Models to Downstream Tasks
Identifying a Pre-training Strategy
Breadth of Pre-training Tasks
A research team is developing a new language model and is considering different pre-training approaches. Match each pre-training scenario below with the correct category of learning it represents.
A language model is being trained on a large corpus of text from the internet. The training process involves randomly hiding 15% of the words in each sentence and then tasking the model with predicting the original identity of these hidden words based on the surrounding context. Which category of pre-training task does this scenario best exemplify, and why?
Comparing Pre-training Task Categories