Concept

Model Usage After Replaced Token Detection Training

Once Replaced Token Detection (RTD) training is finished, the two models involved have different fates. The generator, which exists only to create a challenging training task, is discarded. The discriminator's encoder, which has learned rich contextual representations, is kept and used as the pre-trained model for downstream natural language understanding tasks.
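The hand-off described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not ELECTRA's actual code: the class names (`Encoder`, `Discriminator`, `DownstreamClassifier`) and the tiny model sizes are invented for the example. The key point is that the generator never appears in the downstream model; only the discriminator's encoder is reused, with a freshly initialized task head on top.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy transformer encoder (stand-in for the discriminator's backbone)."""
    def __init__(self, vocab_size=100, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.layer = nn.TransformerEncoderLayer(
            d_model=hidden, nhead=4, batch_first=True)

    def forward(self, ids):
        return self.layer(self.embed(ids))  # (batch, seq, hidden)

class Discriminator(nn.Module):
    """Encoder plus the per-token binary head used during RTD pre-training."""
    def __init__(self):
        super().__init__()
        self.encoder = Encoder()
        self.rtd_head = nn.Linear(32, 1)  # predicts original vs. replaced

    def forward(self, ids):
        return self.rtd_head(self.encoder(ids))

class DownstreamClassifier(nn.Module):
    """Reuses the pre-trained encoder; only the new head starts random."""
    def __init__(self, encoder, num_labels=2):
        super().__init__()
        self.encoder = encoder               # pre-trained weights kept
        self.head = nn.Linear(32, num_labels)  # fresh task-specific head

    def forward(self, ids):
        h = self.encoder(ids)
        return self.head(h[:, 0])            # classify from first token

# Pretend RTD pre-training has already run; the generator (not shown)
# is simply dropped at this point, and the RTD head is left behind too.
discriminator = Discriminator()
model = DownstreamClassifier(discriminator.encoder)

logits = model(torch.randint(0, 100, (2, 8)))
print(logits.shape)  # shape: (2, 2)
```

Fine-tuning then updates the whole `DownstreamClassifier`, encoder included, on the target task's labeled data.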


Updated 2026-04-16
