Concept

GAN-based Training for Replaced Token Detection

Replaced Token Detection normally trains the generator and the discriminator jointly, with the generator optimized as an ordinary masked language model. An alternative is to train the two models within a Generative Adversarial Network (GAN) framework. In this setup, the generator's objective shifts from predicting the masked tokens to producing replacements that fool the discriminator. The discriminator, in turn, is trained to tell the generator's tokens apart from the tokens of the original text. However, this adversarial approach complicates training and is generally harder to scale for this task. One reason is that sampling discrete tokens is not differentiable, so the discriminator's signal cannot be backpropagated directly into the generator.
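To make the difference between the two generator objectives concrete, here is a minimal sketch in plain Python for a single masked position. It is an illustration under stated assumptions, not the training procedure of any particular system: the function names, the toy vocabulary of four tokens, and the fixed table `disc_prob_replaced` standing in for a real discriminator are all hypothetical. The adversarial variant uses a single-sample REINFORCE estimate, which is one common way to pass a learning signal through discrete sampling.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mle_generator_grad(logits, original_id):
    """Joint-training style: gradient of -log p_G(original token) w.r.t. the logits."""
    probs = softmax(logits)
    return [p - (1.0 if j == original_id else 0.0) for j, p in enumerate(probs)]


def adversarial_generator_grad(logits, disc_prob_replaced, rng):
    """GAN style: single-sample REINFORCE estimate of the gradient of the
    negative expected reward w.r.t. the logits.

    disc_prob_replaced[k] is an assumed, fixed stand-in for the discriminator's
    probability that token k is a replacement.
    """
    probs = softmax(logits)
    sampled = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # The reward is high when the discriminator believes the sampled token is original.
    reward = 1.0 - disc_prob_replaced[sampled]
    grad = [
        -reward * ((1.0 if j == sampled else 0.0) - p)
        for j, p in enumerate(probs)
    ]
    return grad, sampled, reward


if __name__ == "__main__":
    logits = [0.0, 0.0, 0.0, 0.0]               # toy generator logits
    disc_prob_replaced = [0.9, 0.2, 0.1, 0.6]   # hypothetical discriminator outputs
    print(mle_generator_grad(logits, original_id=2))
    print(adversarial_generator_grad(logits, disc_prob_replaced, random.Random(0)))
```

The maximum-likelihood gradient depends only on the original token and is deterministic. The adversarial gradient depends on a random sample and on the discriminator's current output, so it is noisier and shifts as the discriminator changes, which is one way to see why the adversarial setup is harder to train.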

Updated 2026-04-16

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course