Learn Before
Diagnosing a Model's Infilling Errors
An engineer is training a model to fill in masked text. They observe a recurring issue: when a multi-word phrase is replaced by a single [MASK] token, the model often generates a plausible but shorter phrase than the original. For example, given the input 'The artist carefully mixed the [MASK] on their palette,' where the original text was 'bright crimson paint,' the model frequently outputs just 'paint.' Based on this pattern of errors, what specific information, lost during the masking process, is the model failing to predict correctly?
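The error pattern described above can be reproduced with a minimal sketch (the `mask_span` helper is hypothetical, for illustration only): when a multi-token span is collapsed into a single [MASK] token, the corrupted input carries no record of how many tokens were hidden, so the original span length is exactly the information the model must infer and is failing to predict.

```python
# Minimal sketch (hypothetical helper) of single-token span masking.
# It shows that the corrupted input alone cannot reveal the hidden span's length.

def mask_span(tokens, start, end, mask_token="[MASK]"):
    """Replace tokens[start:end] with a single mask token."""
    return tokens[:start] + [mask_token] + tokens[end:]

# Original text with a 3-token span: 'bright crimson paint'
long_original = "The artist carefully mixed the bright crimson paint on their palette".split()
# Alternative text with a 1-token span: 'paint'
short_original = "The artist carefully mixed the paint on their palette".split()

masked_long = mask_span(long_original, 5, 8)    # hides 3 tokens
masked_short = mask_span(short_original, 5, 6)  # hides 1 token

# Both corrupted inputs are token-for-token identical, so the span's
# length (3 vs. 1) is lost during masking and must be predicted.
print(masked_long == masked_short)  # True
print(" ".join(masked_long))
```

Because the two corrupted inputs are indistinguishable, a model trained on such data can legitimately output either the short or the long completion; its bias toward the shorter 'paint' reflects a failure to predict the original span length.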
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analogy between Span Length Prediction and Fertility Modeling
A language model is tasked with filling in a masked portion of text. The input provided is: 'The solar system consists of the Sun and the [MASK] that orbit it.' The original, unmasked text was: 'The solar system consists of the Sun and the celestial bodies that orbit it.' The model, however, generates the following output: 'The solar system consists of the Sun and the planets that orbit it.' Which statement provides the most accurate analysis of the model's primary error in this specific case?
The Variable-Length Prediction Problem