Formula

Mathematical Formulation of the Supervised Fine-Tuning Objective

In supervised fine-tuning (SFT), the goal is to adjust pre-trained model parameters to maximize the conditional probability of the target output sequence $\mathbf{y}$ given the input sequence $\mathbf{x}$. Given pre-trained parameters $\hat{\theta}$ and a dataset $\mathcal{D}$ of input-output pairs, the objective is to find the optimized parameters $\tilde{\theta}$ by maximizing the sum of conditional log-probabilities:

$$\tilde{\theta} = \arg\max_{\hat{\theta}^+} \sum_{(\mathbf{x},\mathbf{y}) \in \mathcal{D}} \log \mathrm{Pr}_{\hat{\theta}^+}(\mathbf{y}\,|\,\mathbf{x})$$

where $\hat{\theta}^+$ denotes the fine-tuned parameters initialized from the pre-trained values $\hat{\theta}$. This formulation highlights that optimization starts from the pre-trained weights rather than from random initialization.
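Because maximizing $\log \mathrm{Pr}_{\hat{\theta}^+}(\mathbf{y}\,|\,\mathbf{x})$ is equivalent to minimizing the cross-entropy of the target tokens under the model's next-token distribution, the objective translates directly into a loss function. The following is a minimal PyTorch-style sketch under that assumption; `model` is assumed to be a causal language model returning next-token logits, and `sft_loss`, `x_ids`, and `y_ids` are illustrative names rather than any particular library's API.

```python
import torch
import torch.nn.functional as F

def sft_loss(model, x_ids, y_ids):
    """Negative log-likelihood of target tokens y given input tokens x.

    Assumptions (illustrative, not a specific library API):
    - `model(ids)` returns next-token logits of shape [batch, seq_len, vocab].
    - `x_ids` has shape [batch, |x|], `y_ids` has shape [batch, |y|].
    """
    ids = torch.cat([x_ids, y_ids], dim=1)     # full sequence [x; y]
    logits = model(ids)                        # [batch, |x|+|y|, vocab]

    # Logits at position t predict the token at position t+1, so the
    # predictions for y_1..y_n sit at positions |x|-1 .. |x|+|y|-2.
    start = x_ids.size(1) - 1
    pred = logits[:, start:-1, :]              # predictions for each y_t
    target = y_ids                             # ground-truth tokens of y

    # Summed cross-entropy over the target tokens equals
    # -log Pr(y | x) = -sum_t log Pr(y_t | x, y_<t).
    return F.cross_entropy(pred.reshape(-1, pred.size(-1)),
                           target.reshape(-1),
                           reduction="sum")
```

Minimizing this loss over all pairs in $\mathcal{D}$ by gradient descent, starting from the pre-trained weights $\hat{\theta}$, corresponds to the maximization above; note that only the target tokens in $\mathbf{y}$ contribute loss terms, while the input tokens in $\mathbf{x}$ serve purely as conditioning context.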
