Short Answer

Critique of Negative Sample Generation

A common way to build a training dataset for predicting whether two sentences are consecutive is to create 'negative' examples by pairing a sentence with a random sentence drawn from a different document. Evaluate a potential weakness of this approach. Specifically, what kinds of subtle sentence relationships might the model fail to learn to distinguish?
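For reference, the sampling scheme described in the question can be sketched as follows. This is a minimal illustration, not a specific library's implementation: the function name `make_nsp_pairs`, the `seed` parameter, and the representation of a document as a list of sentence strings are all assumptions made for this example.

```python
import random


def make_nsp_pairs(docs, seed=0):
    """Build (sentence_a, sentence_b, label) triples.

    docs: list of documents, each a list of sentence strings (assumed format).
    label 1 = sentence_b really follows sentence_a in the same document.
    label 0 = sentence_b is a random sentence from a *different* document.
    """
    if len(docs) < 2:
        raise ValueError("need at least two documents to draw negatives")
    rng = random.Random(seed)  # seeded for reproducibility
    pairs = []
    for i, doc in enumerate(docs):
        # Indices of the other, non-empty documents we may sample from.
        other_ids = [j for j in range(len(docs)) if j != i and docs[j]]
        if not other_ids:
            continue
        for sent_a, sent_b in zip(doc, doc[1:]):
            # Positive example: two truly consecutive sentences.
            pairs.append((sent_a, sent_b, 1))
            # Negative example: same first sentence, random sentence
            # taken from a randomly chosen different document.
            other_doc = docs[rng.choice(other_ids)]
            pairs.append((sent_a, rng.choice(other_doc), 0))
    return pairs


docs = [
    ["The cat sat on the mat.", "It purred quietly.", "Then it fell asleep."],
    ["Interest rates rose in March.", "Bond prices fell in response."],
]
pairs = make_nsp_pairs(docs)
```

Note that under this scheme every negative pair crosses a document boundary, while every positive pair stays within one document. That property of the data is the starting point for the critique the question asks for.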

Updated 2025-10-05

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science