Probabilistic Objective of Supervised Fine-Tuning
The objective of supervised fine-tuning is to determine the optimal model parameters, , by maximizing an objective function, , over all samples in the fine-tuning dataset, . The optimization process begins with the parameters initialized from the pre-trained model, denoted as . The formal mathematical representation of this objective is: This equation frames fine-tuning as a maximization problem, which typically corresponds to maximizing the likelihood of the training data.

0
1
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Related
Parameter Initialization and Moderate Adjustment in Fine-Tuning
Probabilistic Objective of Supervised Fine-Tuning
Comparison of Training Objectives: Instruction Fine-Tuning vs. Pre-training
A machine learning team has successfully developed a large language model by training it on a massive, general-purpose text corpus. They now want to make the model better at following specific user commands. To do this, they have created a new, high-quality dataset that is much smaller than the original corpus and consists of example commands paired with ideal responses. Based on the standard procedures for adapting such models, which statement best describes the relationship between the initial training phase and this new adaptation phase?
Training Strategy for a Specialized Chatbot
The training methodology for instruction fine-tuning must be fundamentally different from the methodology used for pre-training, primarily because the dataset used for fine-tuning is substantially smaller.
Objective of Instruction Fine-Tuning
Learn After
Mathematical Formulation of the Supervised Fine-Tuning Objective
SFT as Language Model Training on Concatenated Sequences
A development team starts with a large, pre-trained language model. Their goal is to make this model a specialized chatbot for their company's products. To do this, they use a curated dataset of high-quality, product-related conversations. Which statement best represents the primary mathematical objective of this specialization process?
Deconstructing the Supervised Fine-Tuning Objective
Evaluate the following statement: The objective of supervised fine-tuning is to discover an entirely new set of model parameters from a random initialization, achieved by minimizing a function over the vast dataset originally used for pre-training the model.