Simplicity of NSP Task as a Cause for Reliance on Superficial Cues
The Next Sentence Prediction (NSP) task can push a model toward superficial evidence because the prediction problem is often too easy. In the standard setup, negative examples are second sentences drawn at random from the corpus, so most negatives differ from the first sentence in topic and vocabulary alone. When the connection (or mismatch) between two sentences is that obvious, the model can succeed by latching onto simple lexical and topical correlations rather than developing a more robust, deeper understanding of semantic coherence.
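The ease of the task follows directly from how NSP training pairs are built. A minimal sketch of the standard construction (toy corpus and function names are illustrative, not from the chapter): positives are genuinely adjacent sentences, while negatives pair a sentence with one sampled at random from the whole corpus.

```python
import random

def make_nsp_pairs(documents, seed=0):
    """Build NSP training pairs in the standard BERT-style way:
    each adjacent sentence pair is a positive (label 1), and each
    first sentence is also paired with a randomly sampled sentence
    from the corpus as a negative (label 0)."""
    rng = random.Random(seed)
    all_sentences = [s for doc in documents for s in doc]
    pairs = []
    for doc in documents:
        for a, b in zip(doc, doc[1:]):
            pairs.append((a, b, 1))                          # true next sentence
            # Random negatives usually mismatch in topic, making them
            # easy to reject on surface cues alone. (A real pipeline
            # would also re-sample if the draw happens to equal b.)
            pairs.append((a, rng.choice(all_sentences), 0))
    return pairs

corpus = [
    ["The children went to the river bank.", "They skipped stones for an hour."],
    ["The First National Bank opened in 1901.", "It offers competitive loan rates."],
]
for a, b, label in make_nsp_pairs(corpus):
    print(label, "|", a, "->", b)
```

Because the negative second sentence comes from an unrelated document, a topic-level match is usually enough to separate the classes, which is exactly the shortcut the explanation above describes.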
Ch.1 Pre-training - Foundations of Large Language Models
Related
A language model is being trained on a task where it must determine if Sentence B is the actual sentence that follows Sentence A in a document. Which of the following training pairs is most likely to encourage the model to learn a simple, superficial shortcut for this task, rather than developing a deeper understanding of semantic coherence?
A language model is pre-trained on a task where it must determine if two sentences are consecutive in a text. When presented with the pair: 'The children went to the river bank to skip stones.' and 'The First National Bank offers competitive loan rates.', the model incorrectly classifies them as consecutive. Which of the following best explains why the model made this specific error?
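The error described in this question is consistent with the lexical-overlap shortcut discussed above: both sentences contain the token "bank", even though the two senses are unrelated. A toy illustration of such a heuristic (the scoring function, stop-word list, and threshold are all hypothetical, chosen only to make the failure mode concrete):

```python
import re

def word_overlap_score(sent_a, sent_b):
    """Fraction of shared content words between two sentences:
    a purely surface-level cue with no word-sense disambiguation."""
    stop = {"the", "a", "an", "to", "of", "in", "and", "that", "went"}
    tokenize = lambda s: set(re.findall(r"[a-z]+", s.lower())) - stop
    a, b = tokenize(sent_a), tokenize(sent_b)
    return len(a & b) / max(1, min(len(a), len(b)))

def shortcut_is_next(sent_a, sent_b, threshold=0.2):
    # Classify as "consecutive" whenever surface overlap is high enough.
    return word_overlap_score(sent_a, sent_b) >= threshold

a = "The children went to the river bank to skip stones."
b = "The First National Bank offers competitive loan rates."
print(shortcut_is_next(a, b))  # → True: the repeated token 'bank' fires the shortcut
```

The shared string "bank" is enough to push the overlap score over the threshold, so the heuristic labels the pair consecutive despite the sentences being semantically unrelated, mirroring the model's error in the question.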