Concept

SFT as Language Model Training on Concatenated Sequences

Supervised Fine-Tuning (SFT) can be framed as a standard language model training process: the input x and output y are concatenated into a single sequence, and the model is trained with the usual next-token prediction objective on log Prθ(seq_x,y). Although the model processes the entire concatenated sequence, the training loss is computed exclusively over the tokens of y. Summing the per-token terms log Prθ(y_t | x, y_<t) over those positions is exactly maximizing the conditional log-probability log Prθ(y|x), which aligns the training objective with the goal of predicting the output given the input.
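A minimal sketch of this loss-masking idea, assuming per-position logits over a toy vocabulary and the common convention of marking masked positions with a label of -100 (the helper `masked_nll` and all values here are illustrative, not from the source). For clarity, the example treats row t of the logits as the prediction for token t and omits the one-position label shift used in real causal LM training:

```python
import math

def masked_nll(logits, labels, ignore_index=-100):
    """Average negative log-likelihood, counting only unmasked positions."""
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # prompt (x) tokens contribute no loss
        m = max(row)  # stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[label]
        count += 1
    return total / count

# Toy setup: vocabulary of 4, prompt x has 2 tokens, output y has 2 tokens.
prompt_ids = [1, 3]
response_ids = [0, 2]
input_ids = prompt_ids + response_ids          # concatenated seq_{x,y}
labels = [-100] * len(prompt_ids) + response_ids  # mask the x portion

logits = [[0.1, 2.0, 0.3, 0.2],   # position in x (masked out of the loss)
          [1.5, 0.2, 0.1, 0.9],   # position in x (masked out of the loss)
          [3.0, 0.1, 0.2, 0.1],   # predicts y_1 = 0
          [0.2, 0.1, 2.5, 0.3]]   # predicts y_2 = 2

loss = masked_nll(logits, labels)
```

Because the prompt positions are masked, the loss equals the negative log-probability averaged over the y tokens alone, i.e. an estimate of -log Pr(y|x) per token.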


Updated 2025-10-06


Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
