Learn Before
Imagine two language models are being trained on the same large text corpus. Model A's task is to read an entire sentence and predict a single label for it (e.g., 'positive sentiment' or 'negative sentiment'). Model B's task is to read the same sentence, but for every individual word, it must predict whether that word has been artificially replaced with a different, plausible-sounding word. Which statement best analyzes the fundamental difference in the learning signals these two models receive?
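The contrast in supervision density between the two objectives can be sketched in a few lines of Python. The tokens, the swapped word, and the counts below are purely illustrative, not drawn from any real model or corpus:

```python
# Illustrative sketch: how many supervised targets does each objective
# produce per training example? (Hypothetical sentence and replacement.)

tokens = ["the", "chef", "cooked", "the", "meal"]

# Model A: sequence classification — one label for the entire sentence,
# so a single learning signal per example.
label_a = "positive"
signals_a = 1

# Model B: replaced token detection — suppose a generator swapped
# "cooked" for the plausible-sounding "ate". The detector gets one
# binary label (replaced or original) for every token.
corrupted = ["the", "chef", "ate", "the", "meal"]
labels_b = [orig != corr for orig, corr in zip(tokens, corrupted)]
signals_b = len(labels_b)

print(signals_a)   # 1 target for the whole sentence
print(signals_b)   # 5 targets, one per token position
print(labels_b)    # [False, False, True, False, False]
```

The key point the question probes: Model B receives a dense, per-token learning signal over all positions, while Model A receives a single sparse signal per sentence.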
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Replaced Token Detection as a Self-Supervised Task
Choosing a Training Objective for Error Detection
Evaluating Language Model Training Objectives