SFT as Language Model Training on Concatenated Sequences
Supervised Fine-Tuning (SFT) can be framed as standard language model training. The objective is defined over the concatenated input x and output y sequence, via the joint log-probability log Prθ([x, y]) = log Prθ(x) + log Prθ(y|x). The model performs a forward pass over the entire concatenated sequence, but the training loss is computed only on the output tokens y; masking out the input tokens drops the log Prθ(x) term, so training maximizes exactly the conditional probability Prθ(y|x).
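The selective loss computation can be sketched in a few lines. This is a minimal illustration, not a real training loop: the per-token log-probabilities are hypothetical placeholder values standing in for a model's outputs, and `sft_loss` and `prompt_len` are names introduced here for clarity.

```python
# Minimal sketch of SFT loss masking on a concatenated [x, y] sequence.
# The per-token log-probabilities are illustrative placeholders, not
# outputs of a real model.

def sft_loss(token_logprobs, prompt_len):
    """Negative log-likelihood computed only over completion tokens.

    token_logprobs[i] is log Pr(token_i | tokens_<i) for the full
    concatenated sequence; positions before prompt_len belong to the
    input x and are masked out of the loss.
    """
    completion_logprobs = token_logprobs[prompt_len:]
    return -sum(completion_logprobs)

# Concatenated sequence of 5 tokens: 2 prompt tokens + 3 completion tokens.
logprobs = [-1.2, -0.8, -0.5, -0.3, -0.4]   # hypothetical values
loss = sft_loss(logprobs, prompt_len=2)     # only the last 3 terms count
```

Summing only the completion terms is what turns the joint log-probability objective into the conditional one: the input-token terms are constant with respect to what the model is asked to generate.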

Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
SFT as Language Model Training on Concatenated Sequences
Calculating Conditional Log-Probability Using an LLM
Selective Loss Computation in Joint Probability Language Modeling
Calculating Conditional Log-Probability
An engineer is evaluating a language model and calculates the following log-probabilities for an input sequence x and an output sequence y: the joint log-probability log Pr([x, y]) and the marginal log-probability log Pr(x). They observe that the value of log Pr([x, y]) is significantly more negative than the value of log Pr(x). Based on the fundamental relationship between joint, conditional, and marginal probabilities, what is the most accurate conclusion?
A language model is being evaluated. For a given input sequence x and a potential output sequence y, the model calculates log Pr([x, y]) = -3.5 and log Pr(x) = -5.2. Based on these values, it is reasonable to conclude that the model's probability calculations are functioning correctly.
Mathematical Formulation of the Supervised Fine-Tuning Objective
SFT as Language Model Training on Concatenated Sequences
A development team starts with a large, pre-trained language model. Their goal is to make this model a specialized chatbot for their company's products. To do this, they use a curated dataset of high-quality, product-related conversations. Which statement best represents the primary mathematical objective of this specialization process?
Deconstructing the Supervised Fine-Tuning Objective
Evaluate the following statement: The objective of supervised fine-tuning is to discover an entirely new set of model parameters from a random initialization, achieved by minimizing a function over the vast dataset originally used for pre-training the model.
Learn After
SFT Objective as Maximizing Joint Log-Probability of Concatenated Sequences
In a common fine-tuning strategy, a prompt and its desired completion are concatenated into a single sequence (e.g., [prompt_tokens, completion_tokens]). The language model is then trained on this full sequence, but the training loss is calculated only for the model's predictions on the completion tokens. What is the most accurate analysis of the primary purpose of this specific loss calculation method?
During supervised fine-tuning, if a model is trained on concatenated [input, output] sequences and the training loss is calculated across the entire sequence (both input and output tokens), the model is still being optimized primarily to improve its conditional generation capabilities for the given input.
Diagnosing a Faulty Fine-Tuning Process
Loss Masking via Forward and Backward Passes in SFT
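The conditional-log-probability questions above rest on one identity: log Pr(y|x) = log Pr([x, y]) - log Pr(x), and since Pr(y|x) ≤ 1, the joint log-probability can never exceed the marginal. A quick sketch makes the consistency check concrete; `conditional_logprob` is a name introduced here for illustration.

```python
# Sanity check: Pr([x, y]) = Pr(x) * Pr(y|x) and Pr(y|x) <= 1 together
# imply log Pr([x, y]) <= log Pr(x).

def conditional_logprob(joint_logprob, marginal_logprob):
    """log Pr(y|x) recovered from the joint and marginal log-probabilities."""
    return joint_logprob - marginal_logprob

# The reported values log Pr([x, y]) = -3.5 and log Pr(x) = -5.2 give
# log Pr(y|x) = 1.7 > 0, i.e. Pr(y|x) > 1, which is impossible.
cond = conditional_logprob(-3.5, -5.2)
is_consistent = cond <= 0.0   # False: the model's calculations are faulty
```

This is why the joint being *more* negative than the marginal (as in the first question) is the expected, valid case, while the reversed ordering in the second question signals a bug.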