Token-Level Conditional Log-Probability in Supervised Fine-Tuning
The conditional log-probability of an entire output sequence given an input is computed at the token level using the chain rule. For an output sequence y = (y_1, ..., y_m) of length m, the objective sums the log-probability of each token y_j, conditioned on both the input x and all preceding tokens y_1, ..., y_{j-1} in the output sequence. Expressed mathematically:

log Pr(y | x) = Σ_{j=1}^{m} log Pr(y_j | x, y_1, ..., y_{j-1})

Maximizing this conditional log-probability is mathematically equivalent to minimizing the cross-entropy loss over the output tokens.
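The equivalence can be checked numerically. Below is a minimal sketch in PyTorch, assuming a toy setup: the tensor names logits and targets are illustrative, standing in for the model's per-position outputs over the response and the response token ids. It sums the per-token log-probabilities and confirms that the negated sum equals the summed cross-entropy loss.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: `logits` holds the model's output at each response
# position, shape (m, vocab_size); `targets` holds the m response token ids.
# In practice both would come from a forward pass over a (query, response) pair.
torch.manual_seed(0)
m, vocab_size = 5, 100  # toy response length and vocabulary size
logits = torch.randn(m, vocab_size)
targets = torch.randint(0, vocab_size, (m,))

# Chain-rule decomposition: log Pr(y|x) = sum_j log Pr(y_j | x, y_1..y_{j-1}).
log_probs = F.log_softmax(logits, dim=-1)              # log Pr(token | context)
token_log_probs = log_probs[torch.arange(m), targets]  # pick each y_j's log-prob
seq_log_prob = token_log_probs.sum()                   # log Pr(response | query)

# Cross-entropy check: maximizing seq_log_prob is the same as minimizing
# the summed cross-entropy loss over the response tokens.
ce_loss = F.cross_entropy(logits, targets, reduction="sum")
assert torch.allclose(-seq_log_prob, ce_loss)
print(seq_log_prob.item(), ce_loss.item())
```

The assertion holds because cross-entropy against one-hot targets reduces to the negative log-probability assigned to the correct token, so summing over positions recovers the negated sequence log-probability.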
Tags
Foundations of Large Language Models
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Notational Simplification in Fine-Tuning Formulas
Evaluating an Alternative Fine-Tuning Objective