1Cademy - Consider a pre-training method for a language model that uses two components. The first component, a generator, takes an original sentence and replaces a few words with other plausible words. The second component, a discriminator, then reads this modified sentence. The discriminator's task is to examine every single word in the modified sentence and decide for each one: Is this word from the original sentence, or is it a replacement? What is the primary advantage of training the discrimina

Learn Before

Replaced Token Detection as a Self-Supervised Task

Multiple Choice

Consider a pre-training method for a language model that uses two components. The first component, a 'generator', takes an original sentence and replaces a few words with other plausible words. The second component, a 'discriminator', then reads this modified sentence. The discriminator's task is to examine every single word in the modified sentence and decide for each one: 'Is this word from the original sentence, or is it a replacement?' What is the primary advantage of training the discrimina
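
To make the setup concrete, here is a minimal sketch of how the discriminator's training targets could be built. It is an illustration under stated assumptions, not a real pre-training pipeline: `corrupt` is a hypothetical stand-in for the generator that swaps in random vocabulary words (a trained generator would propose plausible words), and the names `corrupt`, `rtd_labels`, `vocab`, and `replace_prob` are invented for this example.

```python
import random


def corrupt(tokens, vocab, replace_prob=0.15, seed=0):
    """Toy stand-in for the generator (assumption: random swaps, not a trained model)."""
    rng = random.Random(seed)
    corrupted = []
    for tok in tokens:
        if rng.random() < replace_prob:
            # Replace this position with a word drawn from the toy vocabulary.
            corrupted.append(rng.choice(vocab))
        else:
            corrupted.append(tok)
    return corrupted


def rtd_labels(original, corrupted):
    """Return one binary target per position: 1 = replaced, 0 = original.

    A position whose sampled word happens to equal the original word
    is labeled 0, because the two sentences agree there.
    """
    return [int(o != c) for o, c in zip(original, corrupted)]


if __name__ == "__main__":
    original = "the chef cooked the meal".split()
    corrupted = ["the", "chef", "ate", "the", "meal"]  # hand-made example
    for word, label in zip(corrupted, rtd_labels(original, corrupted)):
        print(f"{word:>6}  {'replaced' if label else 'original'}")
```

In this sketch the list returned by `rtd_labels` has exactly one entry per word of the modified sentence, which mirrors the per-word decision described in the question.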

Updated 2025-09-28
