Mathematical Formulation of the Supervised Fine-Tuning Objective
In supervised fine-tuning (SFT), the goal is to adjust pre-trained model parameters to maximize the conditional probability Pr_θ(y | x) of the target output sequence y given the input sequence x. Given pre-trained parameters θ̂ and a dataset D of input-output pairs (x, y), the objective is to find the optimized parameters θ̃ by maximizing the sum of conditional log-probabilities: θ̃ = argmax_θ Σ_{(x,y)∈D} log Pr_θ(y | x), with the optimization initialized at θ̂. This formulation highlights that optimization starts from the pre-trained weights rather than from random initialization.
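The objective above can be sketched in code. This is a minimal toy illustration, not a real training loop: `token_log_prob` is a hypothetical stand-in for the model's next-token distribution Pr_θ, and the dataset is invented. It shows the key structural point that only output tokens contribute log-probability terms, while input tokens are conditioned on.

```python
import math

def sft_log_likelihood(pairs, token_log_prob):
    """Sum of conditional log-probabilities over a fine-tuning dataset.

    pairs: list of (x_tokens, y_tokens) input/output pairs.
    token_log_prob: function (context_tokens, next_token) -> log-probability,
        a stand-in for the model's next-token distribution Pr_theta.
    """
    total = 0.0
    for x, y in pairs:
        context = list(x)          # the input x is conditioned on, not scored
        for tok in y:              # only output tokens add log-prob terms
            total += token_log_prob(context, tok)
            context.append(tok)    # teacher forcing: condition on the gold token
    return total

# Hypothetical uniform "model" over a 4-token vocabulary: log Pr = log(1/4).
uniform = lambda context, tok: math.log(0.25)
pairs = [(["what", "is", "2+2"], ["4"]), (["hi"], ["hello", "there"])]
print(sft_log_likelihood(pairs, uniform))  # 3 output tokens -> 3 * log(0.25)
```

In actual SFT, an optimizer would adjust θ starting from the pre-trained θ̂ to increase this quantity (equivalently, minimize its negation as a loss).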

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Mathematical Formulation of the Supervised Fine-Tuning Objective
SFT as Language Model Training on Concatenated Sequences
A development team starts with a large, pre-trained language model. Their goal is to make this model a specialized chatbot for their company's products. To do this, they use a curated dataset of high-quality, product-related conversations. Which statement best represents the primary mathematical objective of this specialization process?
Deconstructing the Supervised Fine-Tuning Objective
Evaluate the following statement: The objective of supervised fine-tuning is to discover an entirely new set of model parameters from a random initialization, achieved by minimizing a function over the vast dataset originally used for pre-training the model.
Mathematical Formulation of the Supervised Fine-Tuning Objective
A machine learning engineer is performing supervised fine-tuning on a pre-trained language model. The process involves three distinct states for the model's parameters:
- The initial parameters loaded from the pre-trained model before any new training begins.
- The parameters as they are being iteratively updated by the optimization algorithm on the new dataset.
- The final, converged parameters after the fine-tuning process is complete.
Which option correctly maps the standard notation to these three states?
Notation for Predicted Output During Fine-Tuning
In the mathematical description of a model fine-tuning process, different symbols are used to represent the model's parameters at various stages. Match each symbol with its correct description.
Correcting Fine-Tuning Parameter Notation
Optimal Parameters Formula in Fine-Tuning
Maximum Likelihood Estimation (MLE) as the Objective for Supervised Fine-Tuning
A development team is fine-tuning a pre-trained language model using a curated dataset of customer support inquiries (inputs) and their corresponding ideal, human-written responses (outputs). The aim is to create a specialized chatbot that reliably provides answers in the same helpful and accurate style as the examples. From a probabilistic perspective, which statement best describes the fundamental objective of this training process?
Correcting a Flawed Fine-Tuning Objective
Objective for a Specialized Math Tutor
Mathematical Formulation of the Supervised Fine-Tuning Objective
Conditional vs. Joint Probability Objectives in Language Modeling
Learn After
Notational Simplification in Fine-Tuning Formulas
A language model is being fine-tuned on a dataset of customer support chat logs to improve its ability to generate helpful responses. The training process is guided by the objective function θ̃ = argmax_θ Σ log Pr_θ(response | query). During one step of this process, the model processes a single (query, response) pair from the dataset. What is the role of the specific component log Pr(response | query) for this single pair?
The following equation represents the primary goal of a common model training process. Match each mathematical symbol from the equation to its correct description.
Evaluating an Alternative Fine-Tuning Objective
Token-Level Conditional Log-Probability in Supervised Fine-Tuning