
Pre-training Encoder-Decoder Models via Masked Language Modeling

A second approach to pre-training encoder-decoder models is masked language modeling. In this method, a randomly selected subset of tokens in the input sequence is replaced with a special mask symbol, and the model is trained to predict the original tokens at the masked positions from the full masked sequence.
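To make the procedure concrete, the sketch below prepares a single masked-language-modeling training example. It is a minimal illustration in plain Python: the [MASK] symbol, the 15% mask rate, and the helper name make_mlm_example are illustrative assumptions, not details taken from the text.

```python
import random

MASK = "[MASK]"      # illustrative mask symbol (assumption)
MASK_RATE = 0.15     # fraction of tokens to mask; a commonly used value

def make_mlm_example(tokens, mask_rate=MASK_RATE, seed=None):
    """Randomly substitute tokens with the mask symbol.

    Returns the masked sequence (fed to the encoder) and the
    (position, original token) pairs the model must predict.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    targets = []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked[i] = MASK
            targets.append((i, tok))
    return masked, targets

tokens = "the model reads the entire masked sequence".split()
masked, targets = make_mlm_example(tokens, seed=1)
print(masked)   # masked sequence seen by the encoder
print(targets)  # positions and original tokens to recover
```

During pre-training, the encoder consumes the masked sequence and the model's output layer (or decoder) is trained with a cross-entropy loss over the original tokens at the masked positions, so every prediction can condition on the whole masked context.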
