Learn Before
A text-to-text model is being trained on the following data sample formatted as 'input → output':
summarize: The solar system consists of the Sun and the astronomical objects gravitationally bound to it. Of the eight planets, the four inner terrestrial planets are Mercury, Venus, Earth, and Mars, and the four outer giant planets are Jupiter, Saturn, Uranus, and Neptune. → The solar system has eight planets, divided into inner terrestrial and outer giant groups.
Which part of this sample represents the correct, or ground-truth, label that the model is expected to learn to produce?
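As a minimal sketch of how such a sample is typically separated for supervised training, the snippet below splits the string on the '→' separator. It assumes the text before the arrow is the model input (task prefix plus source text) and the text after it is the target the model is trained to reproduce. The helper name split_sample and the choice of separator are illustrative assumptions, not part of any particular library.

```python
from typing import Tuple

# The sample from the question, in 'input → output' form.
SAMPLE = (
    "summarize: The solar system consists of the Sun and the astronomical "
    "objects gravitationally bound to it. Of the eight planets, the four inner "
    "terrestrial planets are Mercury, Venus, Earth, and Mars, and the four outer "
    "giant planets are Jupiter, Saturn, Uranus, and Neptune. → The solar system "
    "has eight planets, divided into inner terrestrial and outer giant groups."
)


def split_sample(sample: str, sep: str = "→") -> Tuple[str, str]:
    """Hypothetical helper: split one 'input → output' sample into its two parts.

    Splits on the first occurrence of the separator only, so the target text
    is kept whole even if it happens to contain the separator again.
    """
    source, target = sample.split(sep, 1)
    return source.strip(), target.strip()


# source: what the model reads; target: the ground-truth text it should produce.
source, target = split_sample(SAMPLE)
print("input :", source)
print("label :", target)
```

Under these assumptions, the training loss would compare the model's generated text against target, while source (including the 'summarize:' prefix) only conditions the generation.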
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Analyzing Training Data Quality
Impact of Incorrect Ground-Truth Labels