Concept

Corrupted Input for Encoder-Decoder Pre-training

When pre-training an encoder-decoder model with BERT-style or denoising autoencoding objectives, the data is first fed into the encoder. The encoder's input is a corrupted token sequence: some tokens are deliberately masked out and replaced with a special placeholder such as [MASK] (or [M] for short).
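
To make the corruption step concrete, below is a minimal Python sketch. It assumes a toy whitespace tokenizer, a fixed masking rate, and an illustrative function name (corrupt); real pre-training pipelines operate on subword token IDs and use more elaborate masking schemes.

```python
import random

MASK = "[MASK]"  # the special placeholder; the text abbreviates it as [M]

def corrupt(tokens, mask_rate=0.15, seed=0):
    """Return a corrupted copy of `tokens` with a random subset of
    positions replaced by the [MASK] placeholder (BERT-style masking)."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    # Mask roughly `mask_rate` of the positions (at least one).
    num_to_mask = max(1, round(mask_rate * len(tokens)))
    for i in rng.sample(range(len(tokens)), num_to_mask):
        corrupted[i] = MASK
    return corrupted

tokens = "the early bird catches the worm".split()
print(corrupt(tokens))  # e.g. ['the', 'early', '[MASK]', 'catches', 'the', 'worm']
```

The corrupted sequence is what the encoder sees; the model is then trained to recover the original tokens at the masked positions (or, in denoising autoencoding, to reconstruct the original sequence with the decoder).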


