1Cademy - Maximum Likelihood Estimation (MLE) as the Objective for Supervised Fine-Tuning

Input: What is the capital of France?
Correct Response: Paris

Learn Before

Objective of Supervised Fine-Tuning
Supervised Fine-Tuning (SFT) as an Example of Labeled Data Fine-Tuning

Formula

Maximum Likelihood Estimation (MLE) as the Objective for Supervised Fine-Tuning

In Supervised Fine-Tuning (SFT), the training objective is to maximize the probability of the model generating a "gold-standard" or ground-truth output ( $y$ ) given a specific input ( $x$ ). This is achieved through Maximum Likelihood Estimation (MLE), where the model's parameters are adjusted to make its predicted token distributions align as closely as possible with the one-hot encoded distributions of the correct response. The formal objective is to maximize the conditional probability: $max \ Pr(y|x)$ .