A key aspect of training text encoders with self-supervision is designing a classification task that forces the model to learn a useful property of language. Match each proposed self-supervised classification task with the primary linguistic property it is designed to teach the model.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Next Sentence Prediction (NSP)
Per-Token Classification for Encoder Training
Designing a Self-Supervised Text Classification Task
A researcher aims to pre-train a text encoder on a large corpus of unlabeled articles. They propose the following self-supervised classification task: For each training instance, a paragraph is extracted. With 50% probability, the sentences within that paragraph are randomly reordered. The model's task is to predict a binary label: 'Original Order' or 'Shuffled Order'. Which statement best evaluates the potential effectiveness of this task for its intended purpose?