Example

Visual Example of Generator Operation in Replaced Token Detection

In replaced token detection, the generator's role is to corrupt the input. Selected tokens of the original sequence are first masked, and each masked position is then filled with the prediction of a small masked language model (the generator). For instance:

$$
\begin{array}{r c c c c c c c c c}
\text{original:} & [\mathrm{CLS}] & \text{The} & \text{boy} & \text{spent} & \text{hours} & \text{working} & \text{on} & \text{toys} & . \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\text{masked:} & [\mathrm{CLS}] & \text{The} & \text{boy} & \text{spent} & [\mathrm{MASK}] & \text{working} & \text{on} & [\mathrm{MASK}] & . \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
 & \multicolumn{9}{c}{\text{Generator (small masked language model)}} \\
 & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\
\text{replaced:} & [\mathrm{CLS}] & \text{The} & \text{boy} & \text{spent} & \text{decades} & \text{working} & \text{on} & \text{toys} & .
\end{array}
$$

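In this example, two positions are masked. At the first, the generator predicts "decades" instead of the original token "hours", so the output sequence contains a replaced token there. At the second, the generator predicts "toys", which matches the original token, so that position is identical to the original sequence.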
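The corruption step above can be sketched in a few lines of Python. This is a minimal illustration, not an actual model: `toy_generator` is a hypothetical stand-in that returns hard-coded predictions matching the example, and the helper names `mask_tokens` and `corrupt` are assumptions introduced here. The per-token labels show which positions a replaced-token detector would need to flag; a position where the generator reproduces the original token counts as "original".

```python
from typing import Callable, List, Sequence, Tuple

MASK = "[MASK]"


def mask_tokens(tokens: Sequence[str], positions: Sequence[int]) -> List[str]:
    """Return a copy of `tokens` with the given positions set to [MASK]."""
    chosen = set(positions)
    return [MASK if i in chosen else tok for i, tok in enumerate(tokens)]


def corrupt(
    tokens: Sequence[str],
    positions: Sequence[int],
    generator: Callable[[Sequence[str], int], str],
) -> Tuple[List[str], List[str], List[str]]:
    """Mask `positions`, fill each with the generator's prediction,
    and label every token as 'original' or 'replaced'."""
    masked = mask_tokens(tokens, positions)
    replaced = list(masked)
    for i in positions:
        replaced[i] = generator(masked, i)
    # A token is 'replaced' only if it differs from the original token.
    labels = [
        "replaced" if new != old else "original"
        for new, old in zip(replaced, tokens)
    ]
    return masked, replaced, labels


def toy_generator(masked: Sequence[str], position: int) -> str:
    """Hypothetical generator: hard-coded predictions for this example only.
    A real generator would be a small masked language model."""
    canned = {4: "decades", 7: "toys"}
    return canned[position]


original = ["[CLS]", "The", "boy", "spent", "hours", "working", "on", "toys", "."]
masked, replaced, labels = corrupt(original, [4, 7], toy_generator)

print(" ".join(masked))
print(" ".join(replaced))
print(labels)
```

Only position 4 ("decades") ends up labeled as replaced; position 7 was masked, but the generator recovered "toys", so it is indistinguishable from an untouched token.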

Updated 2026-04-16
