A developer has a large, pre-existing dataset of input-output pairs for a specific text-based task (e.g., a list of questions and their corresponding answers). They want to use this dataset to create thousands of training examples to teach a language model how to perform this task. Arrange the following actions into the correct chronological order to achieve this efficiently.
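One common way to do this is to define a single instruction template and fill it programmatically with every pair. The sketch below is illustrative only: the template wording, the `build_examples` helper, the `prompt`/`completion` field names, and the toy data are all assumptions, not part of the original exercise.

```python
import json

# Hypothetical fixed instruction template; the exact wording is an assumption.
TEMPLATE = "Answer the following question: {question}"


def build_examples(pairs, template=TEMPLATE):
    """Insert each (question, answer) pair into the template.

    Returns a list of prompt/completion records, one per input pair.
    """
    return [
        {"prompt": template.format(question=question), "completion": answer}
        for question, answer in pairs
    ]


# Toy stand-in for the large pre-existing dataset.
pairs = [
    ("What is the capital of France?", "Paris"),
    ("How many legs does a spider have?", "Eight"),
]

examples = build_examples(pairs)

# Emit one JSON object per line, a format many fine-tuning pipelines accept.
for example in examples:
    print(json.dumps(example))
```

Because the template is written once and applied in a loop, the cost of producing thousands of examples is dominated by the size of the existing dataset rather than by manual authoring.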
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A machine learning team plans to generate a large dataset to train a language model. Their method involves taking an existing, structured collection of input-output pairs and automatically inserting each pair into a fixed instructional phrase. For example, using a dataset of questions and answers, they could generate training examples like 'Answer the following question: [question]' paired with the corresponding '[answer]'. For which of the following tasks would this specific data generation strategy be the LEAST suitable?
Efficient Dataset Generation for a Custom NLP Task