Short Answer

Deconstructing the Supervised Fine-Tuning Objective

A machine learning model is refined using a specialized dataset. The core objective of this refinement process is captured by the following mathematical expression:

$$\tilde{\theta} = \arg\max_{\hat{\theta}^{+}} \sum_{\text{sample} \in D_{tune}} L_{\hat{\theta}^{+}}(\text{sample})$$

Based on this expression, explain the relationship between the initial model parameters (represented by $\hat{\theta}^{+}$) and the final model parameters (represented by $\tilde{\theta}$), and describe the role of the dataset ($D_{tune}$) in this transformation.
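To make the notation concrete, here is a minimal toy sketch of how the symbols map onto code. It is not a real fine-tuning setup. It assumes, purely for illustration, that the model is a one-parameter Bernoulli distribution, that L is the log-likelihood of a sample, and that the maximization is done by plain gradient ascent. The names `log_likelihood`, `objective`, and `fine_tune`, the learning rate, and the step count are all hypothetical choices, not part of the original expression.

```python
import math

def log_likelihood(theta, sample):
    # Assumed form of L_theta(sample): log-probability of a 0/1 sample
    # under a Bernoulli model whose logit is theta.
    p = 1.0 / (1.0 + math.exp(-theta))
    return math.log(p) if sample == 1 else math.log(1.0 - p)

def objective(theta, d_tune):
    # The sum over samples in D_tune that the arg max is taken over.
    return sum(log_likelihood(theta, s) for s in d_tune)

def fine_tune(theta_init, d_tune, lr=0.1, steps=500):
    # Start from the initial parameters and climb the objective.
    theta = theta_init
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-theta))
        grad = sum(s - p for s in d_tune)  # d/dtheta of the summed log-likelihood
        theta += lr * grad
    return theta

theta_hat = -1.0          # stands in for the initial parameters
d_tune = [1, 1, 1, 0]     # stands in for the tuning dataset
theta_tilde = fine_tune(theta_hat, d_tune)  # stands in for the final parameters
```

In this sketch the search starts at `theta_hat` and only the data in `d_tune` decides where it ends up, so a different tuning set would yield a different `theta_tilde` from the same starting point.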

Updated 2025-10-04

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.2 Generative Models - Foundations of Large Language Models

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science