Maximum Likelihood Training Objective for a Dataset of Sequences
The training objective under the Maximum Likelihood Estimation (MLE) framework is to find the model parameters $\theta$ that maximize the total log-probability of all sequences in a dataset $D$. This is achieved by summing the log-probabilities of each individual sequence $\text{seq}$, as calculated by the model parameterized by $\theta$. The general objective is formally expressed as:

$$
\hat{\theta} = \arg\max_{\theta} \sum_{\text{seq} \in D} \log \Pr_{\theta}(\text{seq})
$$

For datasets composed of input-output pairs $(x, y)$, this objective can be specified as maximizing the joint log-probability of the concatenated sequences $[x, y]$:

$$
\hat{\theta} = \arg\max_{\theta} \sum_{(x, y) \in D} \log \Pr_{\theta}([x, y])
$$

This approach is equivalent to maximizing the sum of the log-likelihoods of all data points in the training set.
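As a minimal sketch of this objective, the snippet below sums per-sequence log-probabilities over a dataset for a toy bigram-style model. The table of conditional log-probabilities is borrowed from the worked example further down this note; the function names are illustrative, not from any particular library.

```python
# Toy bigram-style model: conditional log-probabilities log Pr(token | prev).
# Values are borrowed from the worked example later in this note.
LOG_PROBS = {
    ("<s>", "A"): -0.5,
    ("<s>", "B"): -1.5,
    ("A", "B"): -0.2,
    ("B", "A"): -1.0,
    ("A", "<eos>"): -2.0,
    ("B", "<eos>"): -0.1,
}

def sequence_log_prob(seq, log_probs):
    """Log-probability of one sequence: sum of its token-level conditionals."""
    total, prev = 0.0, "<s>"
    for token in seq:
        total += log_probs[(prev, token)]
        prev = token
    return total

def dataset_log_likelihood(dataset, log_probs):
    """The MLE training objective: sum of log-probabilities over all sequences in D."""
    return sum(sequence_log_prob(seq, log_probs) for seq in dataset)

dataset = [["A", "B", "<eos>"], ["B", "A", "<eos>"]]
print(dataset_log_likelihood(dataset, LOG_PROBS))  # ≈ -5.3 (= -0.8 + -4.5)
```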
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Ch.4 Alignment - Foundations of Large Language Models
Related
Relationship between KL Divergence and MLE
Cross-entropy loss
Mean Squared Error
The property of consistency of maximum likelihood
Statistical Efficiency Principle of MLE
Maximum Likelihood Estimator Properties
Log-Likelihood Gradient
Maximum Likelihood Training Objective for a Dataset of Sequences
Kullback-Leibler Divergence
Model Selection via Likelihood
Training Objective as Loss Minimization over a Dataset
Mathematical Equivalence of General and Sequential MLE Objectives
A researcher is modeling a series of coin flips. They observe the following sequence of outcomes: Heads, Tails, Heads, Heads. The researcher wants to find the best parameter for their model, where the parameter represents the probability of the coin landing on Heads. According to the principle of maximum likelihood estimation, which of the following parameter values best explains the observed data?
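For reference, the estimate can be derived in closed form: with three Heads and one Tails, the likelihood of the data as a function of the Heads-probability $\theta$ is maximized at the empirical frequency:

$$
L(\theta) = \theta^3 (1 - \theta), \qquad \frac{dL}{d\theta} = \theta^2 (3 - 4\theta) = 0 \;\Rightarrow\; \hat{\theta} = \frac{3}{4}
$$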
Parameter Estimation via Conditional Log-Likelihood Maximization
Equivalence of Maximizing Likelihood and Minimizing Loss
Equivalence of Squared Loss and Maximum Likelihood Estimation
Negative Log-Likelihood Objective for Softmax Regression
Maximum Likelihood Training Objective for a Dataset of Sequences
A language model is defined by the following table of conditional log-probabilities, where <s> is the start-of-sequence token and <eos> is the end-of-sequence token:

| Log-Probability | Value |
|---|---|
| log Pr(A \| <s>) | -0.5 |
| log Pr(B \| <s>) | -1.5 |
| log Pr(B \| A) | -0.2 |
| log Pr(A \| B) | -1.0 |
| log Pr(<eos> \| A) | -2.0 |
| log Pr(<eos> \| B) | -0.1 |

Given a training dataset D containing two sequences:

- Sequence 1: (A, B, <eos>)
- Sequence 2: (B, A, <eos>)

Calculate the log-likelihood for each individual sequence in the dataset. Which of the following options correctly lists the results?
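For reference, a worked computation: each sequence's log-likelihood is the sum of its token-level conditional log-probabilities from the table above.

$$
\begin{aligned}
\log \Pr_{\theta}(\text{Seq 1}) &= \log \Pr(A \mid \text{<s>}) + \log \Pr(B \mid A) + \log \Pr(\text{<eos>} \mid B) \\
&= -0.5 + (-0.2) + (-0.1) = -0.8 \\
\log \Pr_{\theta}(\text{Seq 2}) &= \log \Pr(B \mid \text{<s>}) + \log \Pr(A \mid B) + \log \Pr(\text{<eos>} \mid A) \\
&= -1.5 + (-1.0) + (-2.0) = -4.5
\end{aligned}
$$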
Verifying Language Model Performance on a Small Dataset
You are tasked with evaluating a language model's performance on a dataset composed of multiple text sequences. Arrange the following steps in the correct logical order to compute the log-likelihood for each individual sequence in the dataset.
Learn After
Maximum Likelihood Estimation for Sequential Data
Fine-Tuning as Maximum Likelihood Estimation
Log-Probability Decomposition for Efficient Multi-Turn Dialogue Training
A language model is being trained on a dataset containing a mix of very short sequences and a few extremely long sequences. A developer observes that the overall training objective, which is the sum of the log-probabilities of all sequences in the dataset, seems to be disproportionately influenced by the model's performance on the few long sequences. Which of the following best explains this observation?
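A clarifying identity: each sequence's log-probability decomposes into one conditional term per token, so longer sequences contribute more (and typically more strongly negative) terms to the summed objective:

$$
\log \Pr_{\theta}(\text{seq}) = \sum_{i=1}^{|\text{seq}|} \log \Pr_{\theta}(x_i \mid x_1, \ldots, x_{i-1})
$$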
Model Parameter Selection via Likelihood
A language model is being trained on a large dataset of text sequences. After a single parameter update, the model's calculated log-probability for one specific sequence in the dataset increases by 2.5, while the log-probabilities for all other sequences in the dataset remain exactly the same. How does this change affect the overall maximum likelihood training objective for the entire dataset?
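Since the objective is additive over sequences, the total change equals the sum of the per-sequence changes; here the overall objective increases by exactly 2.5:

$$
\Delta \left( \sum_{\text{seq} \in D} \log \Pr_{\theta}(\text{seq}) \right) = +2.5
$$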
Standard Optimization Objective for Transformer Language Models