1Cademy - Training Objective as Joint Log-Likelihood Maximization of Concatenated Sequences

Objective A: Maximize the sum of log Pr(y|x) over all pairs in the dataset.
Objective B: Maximize the sum of log Pr(sequence) over all pairs, where sequence is the concatenation of x and y.

Learn Before

Fine-Tuning as Maximum Likelihood Estimation

Formula

Training Objective as Joint Log-Likelihood Maximization of Concatenated Sequences

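A training objective for a model can be formulated as maximizing the joint log-likelihood of concatenated input-output sequences. For a dataset $D$ of input-output pairs $(x, y)$, the optimal parameters $\tilde{\theta}$ are found by maximizing the sum of the log-probabilities of the concatenated sequence $\text{seq}_{x,y}$. The formula is: $\tilde{\theta} = \arg\max_{\theta} \sum_{(x,y) \in D} \log \text{Pr}_{\theta}(\text{seq}_{x,y})$ By the chain rule, $\log \text{Pr}_{\theta}(\text{seq}_{x,y}) = \log \text{Pr}_{\theta}(x) + \log \text{Pr}_{\theta}(y|x)$. When the input distribution $\text{Pr}(x)$ does not depend on the model parameters $\theta$, the first term is constant with respect to $\theta$, so maximizing the joint log-likelihood is equivalent to maximizing the conditional log-likelihood $\log \text{Pr}_{\theta}(y|x)$.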
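
The relationship between the joint and conditional log-likelihoods can be checked numerically. The following is a minimal sketch, not a training implementation: it assumes a hypothetical toy bigram model whose table `NEXT_TOKEN_PROB`, start symbol `<s>`, and example tokens are all made up for illustration. It shows that the log-probability of the concatenated sequence splits into a term for x and a term for y given x.

```python
import math

# Hypothetical toy bigram model: NEXT_TOKEN_PROB[prev][tok] = Pr(tok | prev).
# "<s>" is an assumed start symbol; the probabilities are illustrative only.
NEXT_TOKEN_PROB = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"a": 0.1, "b": 0.9},
    "b": {"a": 0.5, "b": 0.5},
}


def log_prob(tokens, prefix=("<s>",)):
    """Sum of log Pr(token | previous token), starting after `prefix`."""
    prev = prefix[-1]
    total = 0.0
    for tok in tokens:
        total += math.log(NEXT_TOKEN_PROB[prev][tok])
        prev = tok
    return total


x = ["a", "b"]  # input tokens
y = ["b", "a"]  # output tokens

joint = log_prob(x + y)                          # log Pr(seq_{x,y})
marginal_x = log_prob(x)                         # log Pr(x)
conditional = log_prob(y, prefix=["<s>"] + x)    # log Pr(y | x)

# The joint log-likelihood is the sum of the two terms.
print(math.isclose(joint, marginal_x + conditional))
```

In this toy model the term for x is computed from the same table as the term for y given x, so it would change with the parameters. The equivalence stated above applies when that term is held constant with respect to θ, in which case only the conditional term affects the maximizer.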