Learn Before
Diagnosing Model Training Issues from Data Formatting
A data scientist is training a language model for a text summarization task. They have combined the original long text and its corresponding summary into a single sequence for each training example. However, the model is struggling to learn the task and is generating incoherent outputs. Below is an example of how one data point was formatted:
'The Industrial Revolution was the transition to new manufacturing processes... [full long text] ... The Industrial Revolution transformed economies.'
Based on common practices for preparing sequence data, identify the likely error in this formatting and explain why it would cause the model to perform poorly.
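The conventional fix is to mark the boundary between the source text and its summary explicitly. The sketch below is illustrative only: the token strings `<sep>` and `<eos>` are placeholders, and real pipelines would use whatever special tokens the model's tokenizer defines.

```python
# Illustrative convention for packing a source text and its summary
# into one training sequence. "<sep>" and "<eos>" are placeholder
# special tokens, not tied to any particular tokenizer.
SEP = "<sep>"  # boundary between the input text and the target summary
EOS = "<eos>"  # end-of-sequence marker

def format_example(long_text: str, summary: str) -> str:
    """Concatenate source and target with an explicit boundary marker."""
    return f"{long_text} {SEP} {summary} {EOS}"

# Without a separator, the model has no signal for where the article
# ends and the summary begins, so it cannot learn what to generate.
example = format_example(
    "The Industrial Revolution was the transition to new manufacturing processes...",
    "The Industrial Revolution transformed economies.",
)
```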
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being prepared for a question-answering task. The model must process both the question and its corresponding answer as a single, combined sequence. If the question is 'What is the capital of France?' and the answer is 'Paris', how should these two sequences be formatted for the model using a special separator token to distinguish between them?
Diagnosing Model Training Issues from Data Formatting
Debugging Data Preprocessing for a Summarization Model
Example of Sequence Packing for Translation