Essay

Critique of a Pre-training Task Design

A research team is designing a pre-training task for a language model to help it understand discourse coherence. The task involves presenting the model with two sentences and having it predict whether the second sentence immediately follows the first. For negative examples (non-consecutive sentences), the team pairs a sentence from one document with a sentence from a completely unrelated document (e.g., one from a history text and one from a biology text). Analyze why this design might fail to teach the model a sophisticated understanding of sentence-to-sentence coherence. Specifically, explain how the relative ease of this task could encourage the model to adopt a simplistic learning strategy.
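The design described above can be made concrete with a small sketch. It is an illustration only, not the team's actual pipeline: the toy corpus, the stopword list, and the names `build_pairs`, `content_words`, and `topic_shortcut` are all invented here. It builds positive pairs from adjacent sentences and negative pairs from a different document, then scores a rule that looks only at shared vocabulary.

```python
import random

# Toy corpus, invented for illustration: two documents on unrelated topics.
DOCS = {
    "history": [
        "The empire expanded its borders in the second century.",
        "The empire then built roads to connect its distant borders.",
        "Those roads later carried armies and trade across the empire.",
    ],
    "biology": [
        "The cell stores its genetic material in the nucleus.",
        "The nucleus controls which genes the cell expresses.",
        "Those genes determine the proteins the cell produces.",
    ],
}

# Hypothetical minimal stopword list, just large enough for this toy corpus.
STOPWORDS = {"the", "its", "in", "to", "and", "then", "which", "those", "later", "across"}


def build_pairs(docs, seed=0):
    """Return (first, second, label) triples; label 1 means consecutive."""
    rng = random.Random(seed)
    names = list(docs)
    pairs = []
    for name in names:
        sents = docs[name]
        for i in range(len(sents) - 1):
            # Positive example: the sentence that really comes next.
            pairs.append((sents[i], sents[i + 1], 1))
            # Negative example: a random sentence from a different document.
            other = rng.choice([n for n in names if n != name])
            pairs.append((sents[i], rng.choice(docs[other]), 0))
    return pairs


def content_words(sentence):
    """Lowercased words with trailing punctuation and stopwords removed."""
    return {w.strip(".,").lower() for w in sentence.split()} - STOPWORDS


def topic_shortcut(first, second):
    """Predict 'consecutive' whenever the sentences share any content word."""
    return int(bool(content_words(first) & content_words(second)))


pairs = build_pairs(DOCS)
accuracy = sum(topic_shortcut(a, b) == y for a, b, y in pairs) / len(pairs)
print(f"shortcut accuracy on cross-document negatives: {accuracy:.2f}")

# A harder negative: same document, but not adjacent. The shortcut still
# answers "consecutive" because the topic vocabulary overlaps.
hard_first, hard_second = DOCS["history"][0], DOCS["history"][2]
print("shortcut prediction on a same-document, non-adjacent pair:",
      topic_shortcut(hard_first, hard_second))
```

On this toy data the vocabulary rule separates every pair without using sentence order or logical connection, and it mislabels the same-document pair. Real corpora are noisier, so the sketch shows the direction of the problem, not its size.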

Updated 2025-10-10

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science