Learn Before
Dataset Strategy for Model Specialization
Based on the provided scenario, evaluate which of the two approaches is more suitable for the supervised fine-tuning phase of this project. Justify your choice by explaining how the characteristics of the dataset generated by your chosen approach align with the goals of this training phase.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A machine learning engineer is preparing data to teach a pre-trained language model how to follow specific user commands. Below is a single example entry from the dataset they are creating:
{
  "instruction": "Summarize the following text into a single sentence: 'The sun is a star at the center of the Solar System. It is a nearly perfect sphere of hot plasma, with internal convective motion that generates a magnetic field via a dynamo process. It is by far the most important source of energy for life on Earth.'",
  "output": "The sun, a star at the center of our Solar System, is a sphere of hot plasma that generates a magnetic field and is the primary energy source for life on Earth."
}
Based on the structure and content of this data point, what is its primary purpose in this training phase?
Dataset Strategy for Model Specialization
Crafting an SFT Data Point
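As a rough illustration of how an instruction/output record like the one in the related question above might be consumed during supervised fine-tuning, the sketch below turns one record into a token sequence plus a loss mask that covers only the response. The prompt template, the `build_sft_example` helper, and the whitespace "tokenizer" are all assumptions made for this example, and the sample text is shortened; a real pipeline would use the model's own tokenizer and chat format.

```python
# Hypothetical prompt template; actual SFT pipelines define their own format.
PROMPT_TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def build_sft_example(record):
    """Turn an instruction/output record into tokens plus a loss mask."""
    prompt = PROMPT_TEMPLATE.format(instruction=record["instruction"])
    response = record["output"]

    # Assumption: whitespace splitting stands in for a real subword tokenizer.
    prompt_tokens = prompt.split()
    response_tokens = response.split()

    tokens = prompt_tokens + response_tokens
    # 0 = token is context only (prompt), 1 = token contributes to the loss.
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask


# Shortened version of the summarization record, for illustration only.
record = {
    "instruction": "Summarize the following text into a single sentence: "
                   "'The sun is a star at the center of the Solar System.'",
    "output": "The sun is a star at the center of the Solar System.",
}

tokens, loss_mask = build_sft_example(record)
for token, flag in zip(tokens, loss_mask):
    print(flag, token)
```

Masking the prompt this way is a common (though not universal) choice: the model is trained to produce the demonstrated output given the instruction, rather than to reproduce the instruction itself.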