Learn Before
Sampling in Self-Instruct
In each Self-Instruct cycle, a few existing instructions are sampled from the task pool. These sampled instructions serve as in-context examples in the prompt, so that the large language model generates a new, related instruction.
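The sampling step can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the function names `sample_instructions` and `build_prompt`, the default `k=3`, uniform sampling without replacement, and the prompt wording are all assumptions made for this example.

```python
import random


def sample_instructions(task_pool, k=3, seed=None):
    """Draw k distinct instructions uniformly at random from the pool.

    k=3 and uniform sampling are illustrative assumptions.
    """
    if k > len(task_pool):
        raise ValueError("k cannot exceed the size of the task pool")
    rng = random.Random(seed)
    return rng.sample(task_pool, k)


def build_prompt(examples):
    """Format sampled instructions as numbered in-context examples.

    The prompt wording here is hypothetical. The final, empty slot is
    left for the model to complete with a new instruction.
    """
    lines = ["Come up with a new task instruction."]
    for i, instruction in enumerate(examples, start=1):
        lines.append(f"Task {i}: {instruction}")
    lines.append(f"Task {len(examples) + 1}:")
    return "\n".join(lines)


if __name__ == "__main__":
    pool = [
        "Sort a list of integers in ascending order.",
        "Summarize the following paragraph in one sentence.",
        "Translate the sentence into French.",
        "Write a haiku about autumn.",
    ]
    examples = sample_instructions(pool, k=3, seed=0)
    print(build_prompt(examples))
```

The resulting prompt string would then be sent to the language model, whose completion of the last slot becomes a candidate instruction for the pool.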

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Sampling in Self-Instruct
A research team is initiating a process to enhance a language model's ability to generate Python code from natural language descriptions. For their initial task pool, they gather 1,000 random Python code snippets from public repositories. Based on the principles of initializing this process, what is the primary weakness of their approach?
Evaluating Seed Task Suitability
In a methodology designed to bootstrap a large set of instructional data, the initial 'seed' tasks used to start the process are typically generated automatically by a language model.
Learn After
Instruction Generation in Self-Instruct
A team is developing a system to automatically generate new instructional tasks for a large language model. The system works by first selecting a few existing tasks from a large pool to serve as examples. In one run, the system selects three examples that are all variations of the same task: 'Sort a list of integers in ascending order.' What is the most probable outcome when these highly similar examples are used to prompt the model to generate a new instruction?
Critique of a Sampling Strategy for Instruction Generation
Diagnosing a Flaw in an Instruction Generation Pipeline