Formula

Probabilistic Objective of Supervised Fine-Tuning

The objective of supervised fine-tuning is to determine the optimal model parameters, $\tilde{\theta}$, by maximizing an objective function, $L$, over all samples in the fine-tuning dataset, $D_{tune}$. The optimization process begins with the parameters initialized from the pre-trained model, denoted as $\hat{\theta}^{+}$. The formal mathematical representation of this objective is:

$$\tilde{\theta} = \arg \max_{\hat{\theta}^{+}} \sum_{\text{sample} \in D_{tune}} L_{\hat{\theta}^{+}}(\text{sample})$$

This equation frames fine-tuning as a maximization problem, which typically corresponds to maximizing the likelihood of the training data.
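To make the objective concrete, here is a minimal Python sketch. It is not the source's method: it assumes a toy categorical model whose parameters are three logits, takes $L$ to be the per-sample log-likelihood, and uses plain gradient ascent as the optimizer. The names `objective`, `fine_tune`, `theta_pretrained`, and `d_tune` are hypothetical and chosen only for illustration.

```python
import math

def log_softmax(theta):
    # Numerically stable log-softmax over a list of logits.
    m = max(theta)
    log_z = m + math.log(sum(math.exp(t - m) for t in theta))
    return [t - log_z for t in theta]

def objective(theta, d_tune):
    # Sum of per-sample log-likelihoods over the fine-tuning set
    # (assumption: L is the log-likelihood of each sample).
    logp = log_softmax(theta)
    return sum(logp[y] for y in d_tune)

def fine_tune(theta_init, d_tune, lr=0.1, steps=200):
    # Start from the pre-trained parameters, then ascend the objective.
    theta = list(theta_init)
    n = len(d_tune)
    counts = [0] * len(theta)
    for y in d_tune:
        counts[y] += 1
    for _ in range(steps):
        probs = [math.exp(lp) for lp in log_softmax(theta)]
        # Gradient of the summed log-likelihood w.r.t. logit k:
        # counts[k] - n * probs[k]
        grad = [counts[k] - n * probs[k] for k in range(len(theta))]
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical "pre-trained" logits and a tiny fine-tuning dataset of token ids.
theta_pretrained = [2.0, 0.0, 0.0]
d_tune = [1, 1, 2, 1, 0, 1]

theta_tuned = fine_tune(theta_pretrained, d_tune)
before = objective(theta_pretrained, d_tune)
after = objective(theta_tuned, d_tune)
print(f"objective before: {before:.4f}, after: {after:.4f}")
```

In this toy setting the tuned distribution moves toward the empirical token frequencies in `d_tune`, which is what maximizing the summed log-likelihood implies for an unconstrained categorical model. Real fine-tuning differs mainly in scale: the parameters are those of a neural network and the optimizer is usually a stochastic gradient method.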

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.2 Generative Models - Foundations of Large Language Models