Choosing a Training Objective for Error Detection
A researcher wants to train a language model to be highly effective at identifying the specific location of grammatical errors within long sentences. They are considering two self-supervised training objectives:
Objective 1: The model reads an entire sentence and outputs a single label: 'grammatically correct' or 'contains an error'.
Objective 2: The model reads an entire sentence and, for each individual word, outputs a label: 'correct' or 'incorrect'.
Which objective is more suitable for the researcher's goal? Justify your answer by explaining the difference in the supervision signal provided by each approach.
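The difference in supervision signal can be made concrete in code. The sketch below is a minimal, illustrative PyTorch example, not an implementation from the course: the encoder output is a random stand-in for a real model, and all names and sizes (H, T, sentence_head, token_head, the error position) are assumptions chosen for the demonstration. It shows that Objective 1 supplies one label per sentence, while Objective 2 supplies one label per word position.

```python
# Minimal sketch contrasting the two objectives (all values illustrative).
import torch
import torch.nn as nn

H = 64   # hidden size (assumed for the example)
T = 12   # number of words in the sentence

# Stand-in for a real encoder's output: one hidden vector per word.
encoder_output = torch.randn(1, T, H)  # (batch, words, hidden)

# Objective 1: one label for the whole sentence.
# Supervision signal: a single bit per training example.
sentence_head = nn.Linear(H, 2)  # 'grammatically correct' vs. 'contains an error'
sentence_logits = sentence_head(encoder_output.mean(dim=1))  # pool words -> (1, 2)
sentence_label = torch.tensor([1])  # 1 = contains an error (somewhere)
loss1 = nn.functional.cross_entropy(sentence_logits, sentence_label)

# Objective 2: one label per word.
# Supervision signal: T position-specific bits per training example.
token_head = nn.Linear(H, 2)  # 'correct' vs. 'incorrect', applied to every word
token_logits = token_head(encoder_output)  # (1, T, 2)
token_labels = torch.zeros(1, T, dtype=torch.long)  # most words are 'correct'
token_labels[0, 5] = 1  # the word at position 5 is marked 'incorrect'
loss2 = nn.functional.cross_entropy(token_logits.view(-1, 2), token_labels.view(-1))

# loss1 backpropagates from 1 label per sentence;
# loss2 backpropagates from T labels, each tied to a specific word position.
```

Note the design difference the sketch exposes: for a sentence of T words, the per-word objective delivers roughly T times as many training signals per example, and each signal is anchored to a specific position rather than to the sentence as a whole.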
Related
Replaced Token Detection as a Self-Supervised Task
Imagine two language models are being trained on the same large text corpus. Model A's task is to read an entire sentence and predict a single label for it (e.g., 'positive sentiment' or 'negative sentiment'). Model B's task is to read the same sentence, but for every individual word, it must predict whether that word has been artificially replaced with a different, plausible-sounding word. Which statement best analyzes the fundamental difference in the learning signals these two models receive? (A sketch of how such per-word replacement labels can be generated automatically appears after this list.)
Evaluating Language Model Training Objectives
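For the replaced-token-detection setup described in the related item above, the per-word labels can be generated without any human annotation. The sketch below is a toy illustration under assumed details: the vocabulary, the corrupt function, and the uniform-random replacement strategy are invented for the example. ELECTRA-style replaced token detection instead samples plausible replacements from a small generator model, but the shape of the resulting supervision signal is the same: one label per word position.

```python
# Toy sketch: automatically generating per-word 'replaced?' labels.
import random

random.seed(0)
vocabulary = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]

def corrupt(words, replace_prob=0.15):
    """Randomly replace some words and record a 0/1 label per position."""
    corrupted, labels = [], []
    for word in words:
        if random.random() < replace_prob:
            replacement = random.choice([w for w in vocabulary if w != word])
            corrupted.append(replacement)
            labels.append(1)  # this position was replaced
        else:
            corrupted.append(word)
            labels.append(0)  # original word kept
    return corrupted, labels

sentence = "the cat sat on the mat".split()
corrupted, labels = corrupt(sentence)
print(corrupted)  # e.g. ['the', 'cat', 'ran', 'on', 'the', 'mat']
print(labels)     # e.g. [0, 0, 1, 0, 0, 0]
# Model B is trained to recover `labels` from `corrupted`: one learning
# signal per word, with no human annotation required.
```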