1Cademy - Deconstructing the Supervised Fine-Tuning Objective

Learn Before

Probabilistic Objective of Supervised Fine-Tuning

Short Answer

Deconstructing the Supervised Fine-Tuning Objective

A machine learning model is refined using a specialized dataset. The core objective of this refinement process is captured by the following mathematical expression:

$\tilde{\theta} = \arg \max_{\hat{\theta}^{+}} \sum_{\text{sample} \in D_{tune}} L_{\hat{\theta}^{+}}(\text{sample})$

Based on this expression, explain the relationship between the initial model parameters (represented by $\hat{\theta}^{+}$) and the final model parameters (represented by $\tilde{\theta}$), and describe the role of the dataset ($D_{tune}$) in this transformation.
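
To make the notation concrete, the expression can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the setup the expression was written for: the model is reduced to a single scalar parameter, $L$ is assumed to be the log-likelihood of a binary sample under a Bernoulli model, and the arg max is approximated with plain gradient ascent. The names `log_likelihood`, `objective`, `fine_tune`, `theta_init`, and `d_tune` are hypothetical.

```python
import math


def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))


def log_likelihood(theta, sample):
    """Assumed form of L_theta(sample): log-probability of a binary sample
    under a Bernoulli model with success probability sigmoid(theta)."""
    p = sigmoid(theta)
    return math.log(p) if sample == 1 else math.log(1.0 - p)


def objective(theta, d_tune):
    """The summed term inside the arg max: sum of L_theta(sample) over D_tune."""
    return sum(log_likelihood(theta, sample) for sample in d_tune)


def fine_tune(theta_init, d_tune, lr=0.1, steps=500):
    """Approximate the arg max by gradient ascent, starting from theta_init."""
    theta = theta_init  # the optimized parameter starts at its initial value
    for _ in range(steps):
        p = sigmoid(theta)
        # Gradient of the summed Bernoulli log-likelihood with respect to theta.
        grad = sum(sample - p for sample in d_tune)
        theta += lr * grad
    return theta


d_tune = [1, 1, 1, 0]   # hypothetical tuning dataset
theta_init = 0.0        # hypothetical initial parameter value
theta_tilde = fine_tune(theta_init, d_tune)

print(objective(theta_init, d_tune), objective(theta_tilde, d_tune))
print(sigmoid(theta_tilde))
```

In this sketch the data in `d_tune` alone determine where the optimization ends up, while `theta_init` only determines where it starts; the objective value at `theta_tilde` is higher than at `theta_init`.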

Updated 2025-10-04
