Flexibility of Masked Language Modeling for Encoder-Decoder Training

The Masked Language Modeling (MLM) framework offers considerable flexibility for training encoder-decoder models. Different training objectives arise from adjusting two key parameters: the masking ratio (the percentage of tokens that are masked) and the maximum length of the text spans replaced by a single mask token. By varying these settings, the objective can range from BERT-style partial masking to a full language modeling task in which the decoder must generate the entire sequence.
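To make this concrete, below is a minimal sketch of a span-corruption routine in this spirit. The function `span_corrupt` and its parameter names (`mask_ratio`, `max_span_len`, `mask_token`) are hypothetical illustrations, not from the source; the sketch assumes the input is a plain list of tokens and shows how the two parameters interpolate between the two regimes.

```python
import random

def span_corrupt(tokens, mask_ratio=0.15, max_span_len=3, mask_token="<mask>"):
    """Replace random token spans with a single mask token (hypothetical helper).

    Returns (corrupted, targets): the encoder reads `corrupted`,
    and the decoder is trained to produce the masked-out `targets`.
    """
    n = len(tokens)
    budget = max(1, int(round(n * mask_ratio)))  # total number of tokens to mask
    masked = [False] * n
    attempts = 0
    while budget > 0 and attempts < 10 * n:  # guard against rare placement failures
        attempts += 1
        span = random.randint(1, min(max_span_len, budget))
        start = random.randrange(n - span + 1)
        if any(masked[start:start + span]):
            continue  # skip placements that overlap an existing span
        for i in range(start, start + span):
            masked[i] = True
        budget -= span

    corrupted, targets = [], []
    i = 0
    while i < n:
        if masked[i]:
            corrupted.append(mask_token)      # one mask token per contiguous span
            while i < n and masked[i]:
                targets.append(tokens[i])     # decoder must recover these tokens
                i += 1
        else:
            corrupted.append(tokens[i])
            i += 1
    return corrupted, targets

tokens = "the cat sat on the mat".split()

# BERT-style objective: mask roughly 15% of tokens, single-token spans.
print(span_corrupt(tokens, mask_ratio=0.15, max_span_len=1))

# Full language modeling: every token is masked, so the contiguous
# masked run collapses to a single mask token and the decoder must
# generate the entire sequence from scratch.
print(span_corrupt(tokens, mask_ratio=1.0, max_span_len=len(tokens)))
```

With a low masking ratio and single-token spans, the corrupted input closely resembles BERT-style masking; pushing the ratio to 1.0 leaves the encoder with essentially no content, so training reduces to generating the full sequence, matching the range of objectives described above.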
