Formula

Objective of Instruction Fine-Tuning

The objective of instruction fine-tuning is to optimize the pre-trained model parameters, denoted as $\hat{\theta}$, using a smaller fine-tuning dataset $\mathcal{D}_{\mathrm{tune}}$. The goal is to maximize the likelihood of generating the desired responses for the samples in the fine-tuning dataset. The objective function is formulated as:

$$
\tilde{\theta} = \arg\max_{\hat{\theta}^+} \sum_{\mathrm{sample} \in \mathcal{D}_{\mathrm{tune}}} \mathcal{L}_{\hat{\theta}^+}(\mathrm{sample})
$$

where $\tilde{\theta}$ represents the optimized parameters after fine-tuning, and $\hat{\theta}^+$ indicates that the optimization starts from the pre-trained parameters $\hat{\theta}$.
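The objective above can be sketched in code. This is a minimal illustration, not the book's implementation: the tabular "model" (a dictionary mapping a context string to a next-token distribution) and all function names are hypothetical stand-ins for a real language model, and $\mathcal{L}$ is taken to be the log-likelihood of the desired response.

```python
import math

def log_likelihood(model, sample):
    """L_theta(sample): sum of log-probabilities of the desired response
    tokens, each conditioned on the prompt plus the tokens emitted so far.

    `model` is a hypothetical toy stand-in: a dict mapping a context string
    to a dict of next-token probabilities."""
    prompt, response = sample
    context = prompt
    total = 0.0
    for token in response:
        total += math.log(model[context][token])
        context = context + " " + token  # grow the context autoregressively
    return total

def tuning_objective(model, d_tune):
    """The quantity maximized during fine-tuning: the summed log-likelihood
    of all (prompt, desired-response) samples in D_tune."""
    return sum(log_likelihood(model, sample) for sample in d_tune)

# Toy tuning set with one sample: prompt "hi", desired response "there !".
toy_model = {
    "hi": {"there": 0.5, "you": 0.5},
    "hi there": {"!": 1.0},
}
d_tune = [("hi", ["there", "!"])]
score = tuning_objective(toy_model, d_tune)  # log(0.5) + log(1.0)
```

In practice the maximization over $\hat{\theta}^+$ is carried out by gradient ascent on this sum (equivalently, gradient descent on the negative log-likelihood), starting from the pre-trained parameters rather than a random initialization.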


Updated 2026-04-19
