SFT Objective as Maximizing Joint Log-Probability of Concatenated Sequences
When Supervised Fine-Tuning (SFT) is framed as a standard language model training task, the objective is to find the parameters $\hat{\theta}$ that maximize the sum of the log-probabilities of the concatenated input-output sequences across the entire dataset $D$. This is formally expressed as:

$$\hat{\theta} = \arg\max_{\theta} \sum_{(x,y) \in D} \log \Pr_{\theta}(\mathrm{seq}_{x,y})$$

where $\mathrm{seq}_{x,y}$ denotes the input $x$ and output $y$ concatenated into a single sequence. By taking $\log \Pr_{\theta}(\mathrm{seq}_{x,y})$ as the objective function, SFT can be implemented using standard LLMs, treating the combined input and output as a single sequence for the model to process.
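To make this concrete, below is a minimal PyTorch sketch of the corresponding training loss. It is an illustration under assumptions rather than a reference implementation: the model is assumed to be a Hugging Face-style causal LM whose forward pass returns `.logits`, and the helper name `sft_joint_logprob_loss` is hypothetical. Because the loss is summed over every position of the concatenated sequence, minimizing it is equivalent to maximizing $\log \Pr_{\theta}(\mathrm{seq}_{x,y})$, and the procedure is identical to ordinary language-model training except that each example is a prompt-response pair joined into one token sequence.

```python
import torch.nn.functional as F

def sft_joint_logprob_loss(model, input_ids):
    """Negative joint log-probability of concatenated [input, output] sequences.

    `input_ids` holds prompt and completion tokens joined into one sequence,
    so the loss covers every position, matching
    argmax_theta sum_{(x,y) in D} log Pr_theta(seq_{x,y}).
    """
    logits = model(input_ids).logits                  # (batch, seq_len, vocab)
    # Shift so the logits at position t score the token at position t+1.
    shift_logits = logits[:, :-1, :]
    shift_labels = input_ids[:, 1:]
    log_probs = F.log_softmax(shift_logits, dim=-1)
    token_log_probs = log_probs.gather(-1, shift_labels.unsqueeze(-1)).squeeze(-1)
    # Joint log-probability of each sequence; negate so that minimizing
    # the loss maximizes the summed log-probability over the batch.
    return -token_log_probs.sum(dim=-1).mean()
```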

Tags
- Ch.4 Alignment - Foundations of Large Language Models
- Foundations of Large Language Models
- Foundations of Large Language Models Course
- Computing Sciences
Related
- In a common fine-tuning strategy, a prompt and its desired completion are concatenated into a single sequence (e.g., [prompt_tokens, completion_tokens]). The language model is then trained on this full sequence, but the training loss is calculated only for the model's predictions on the completion tokens. What is the most accurate analysis of the primary purpose of this specific loss calculation method? (See the masking sketch after this list.)
- During supervised fine-tuning, if a model is trained on concatenated [input, output] sequences and the training loss is calculated across the entire sequence (both input and output tokens), the model is still being optimized primarily to improve its conditional generation capabilities for the given input.
- Diagnosing a Faulty Fine-Tuning Process
- Loss Masking via Forward and Backward Passes in SFT
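The loss masking mentioned in the related items above can be sketched in a few lines. This is a hypothetical helper, not the course's implementation: it again assumes a Hugging Face-style causal LM returning `.logits`, and `prompt_len` (the number of prompt tokens) is an assumed parameter. Masking prompt positions with PyTorch's `ignore_index` means gradients reflect only the conditional probability Pr(output | input), in contrast to the joint objective above.

```python
import torch.nn.functional as F

def masked_sft_loss(model, input_ids, prompt_len):
    """Cross-entropy over completion tokens only (hypothetical helper).

    Prompt positions are excluded from the loss, so the model is optimized
    for Pr(output | input) rather than the joint probability of the
    whole concatenated sequence.
    """
    logits = model(input_ids).logits          # (batch, seq_len, vocab)
    labels = input_ids.clone()
    labels[:, :prompt_len] = -100             # -100 = PyTorch's default ignore_index
    # Shift so the logits at position t score the token at position t+1.
    shift_logits = logits[:, :-1, :].reshape(-1, logits.size(-1))
    shift_labels = labels[:, 1:].reshape(-1)
    return F.cross_entropy(shift_logits, shift_labels, ignore_index=-100)
```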
Learn After
A language model is being fine-tuned on a dataset D containing two input-output pairs: (x1, y1) and (x2, y2). The training objective is to find the model parameters that maximize the sum of the log-probabilities of the concatenated input-output sequences across the entire dataset. Two candidate models, Model A and Model B, produce the following log-probabilities for the concatenated sequences:
- Model A: log Pr(seq_x1,y1) = -1.2, log Pr(seq_x2,y2) = -0.8
- Model B: log Pr(seq_x1,y1) = -0.9, log Pr(seq_x2,y2) = -1.3
Based on the stated training objective, which model is preferred and why?
- Model A: summing over the dataset gives (-1.2) + (-0.8) = -2.0 for Model A and (-0.9) + (-1.3) = -2.2 for Model B; since -2.0 > -2.2, Model A achieves the higher total log-probability and is preferred under the stated objective.
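As a quick check of the arithmetic, the totals can be computed directly (plain Python, values taken from the question above):

```python
# Hypothetical per-sequence log-probabilities from the question.
model_a = [-1.2, -0.8]
model_b = [-0.9, -1.3]

# The SFT objective sums log-probabilities over the dataset.
total_a = sum(model_a)   # -2.0
total_b = sum(model_b)   # -2.2 (up to float rounding)

# The preferred model is the one with the larger total.
print("Model A" if total_a > total_b else "Model B")   # Model A
```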
Interpreting the SFT Objective Function
When fine-tuning a language model with the objective to maximize the sum of log-probabilities across all concatenated input-output sequences in a dataset, which of the following statements accurately describes the training dynamics?