In preparing a data sample for supervised fine-tuning, a common practice is to structure the sample by concatenating the output segment (ysample) and the input segment (xsample) into a single sequence: sample = [ysample, xsample]. What is the primary reason for placing the output segment before the input segment in this structure?
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Selective Gradient Propagation for Sub-sequence Loss
Sample-wise Negative Log-Likelihood Loss for a Sub-sequence
For a supervised fine-tuning task, a single training instance consists of an input segment (
xsample) and a corresponding output segment (ysample). Ifxsampleis 'Instruction: Translate to Spanish. Input: Hello.' andysampleis 'Response: Hola.', which of the following represents the correct structure for the final combined sample that the model will process?Deconstructing a Fine-Tuning Sample
In preparing a data sample for supervised fine-tuning, a common practice is to structure the sample by concatenating the output segment (
ysample) and the input segment (xsample) into a single sequence:sample = [ysample, xsample]. What is the primary reason for placing the output segment before the input segment in this structure?