Learn Before
Structure of a Task Sample in Self-Instruct
Within the Self-Instruct framework, each task sample in the pool is structured as a triplet: an instruction, a user input, and the corresponding output. The initial seed tasks are hand-crafted in this format, and new samples generated by the LLM follow the same structure.
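As a rough illustration of the triplet structure described above, the following minimal Python sketch models a task sample as a small record and keeps seed and generated samples in one pool. The class name `TaskSample`, the helper `to_json_line`, the field names, and the example tasks are all assumptions made for illustration; they are not taken from the Self-Instruct codebase.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TaskSample:
    """Hypothetical container for one (instruction, input, output) triplet."""
    instruction: str  # what the model is asked to do
    input: str        # the specific content the instruction is applied to
    output: str       # the desired response


def to_json_line(sample: TaskSample) -> str:
    """Serialize a sample as one JSON object, e.g. for a JSON Lines file."""
    return json.dumps(asdict(sample))


# A hand-written seed task (made-up example).
seed_pool = [
    TaskSample(
        instruction="Convert the temperature from Celsius to Fahrenheit.",
        input="25",
        output="77",
    )
]

# A sample that an LLM might generate; it uses the same three fields.
generated = TaskSample(
    instruction="Rewrite the sentence in the past tense.",
    input="She walks to the station.",
    output="She walked to the station.",
)

# Seed and generated samples share one structure, so they can share one pool.
task_pool = seed_pool + [generated]

for sample in task_pool:
    print(to_json_line(sample))
```

Because hand-crafted and generated samples share the same three fields, downstream steps can treat the pool uniformly regardless of where a sample came from.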
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Structure of a Task Sample in Self-Instruct
An engineer is implementing a process to generate training data. The process begins with 100 manually-created instructional prompts. In each cycle, the system uses a language model to generate 20 new prompts, which are then reviewed for quality and added to the existing set. Which statement best analyzes the state of the prompt collection after 10 successful cycles?
A team is developing a system to generate instructional data. They begin with a fixed set of 500 human-written tasks. A language model is then prompted using only these 500 tasks to generate thousands of new examples. The newly generated instructions are collected for the final dataset but are never added back to the original pool of 500 tasks. What is the most significant limitation of this approach?
A team is using an automated process to expand a collection of instructional tasks, starting from a small set of human-written examples. Arrange the following events to show the correct sequence for how a single new, high-quality task is generated and integrated into the collection.
Learn After
A developer is creating a new task to add to a collection used for fine-tuning a large language model. The goal is to teach the model how to classify the sentiment of a movie review. Which of the following options correctly structures this new task sample as a distinct instruction, a specific input to which the instruction is applied, and the corresponding desired output?
A language model is being trained with samples that follow a specific three-part structure. Analyze the following sample and match each label (Instruction, Input, Output) to its corresponding component (Component A, Component B, Component C).
Sample:
- Component A: "Write a one-sentence summary of the following text."
- Component B: "The sun is a star at the center of the Solar System. It is a nearly perfect ball of hot plasma, heated to incandescence by nuclear fusion reactions in its core. The Sun radiates this energy mainly as light, ultraviolet, and infrared radiation, and is the most important source of energy for life on Earth."
- Component C: "The sun, a star at the center of our solar system, is a ball of hot plasma that provides Earth with energy through light and radiation."
Creating a Task Sample for a Language Model