Model Comparison using Conditional Log-Likelihood
A language model is being trained on a small dataset for a classification task. You are evaluating two different sets of model parameters, one for Model A and one for Model B. For a batch of two examples, the models produce the following probabilities for the correct target classes:
Example 1:
- Model A:
- Model B:
Example 2:
- Model A:
- Model B:
Calculate the total conditional log-likelihood for the batch for each set of parameters. Based on your calculation, which set of parameters (Model A's or Model B's) is considered a better fit for this data? Use the natural logarithm (ln) for your calculations.
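The calculation can be sketched as follows. The per-example probabilities shown here are hypothetical placeholders (the original values are not given in the text above); substitute the actual probabilities from the question.

```python
import math

# Hypothetical probabilities for the correct class on examples 1 and 2
# (illustrative values only; replace with the values from the question).
probs_a = [0.7, 0.8]  # Model A
probs_b = [0.5, 0.9]  # Model B

def total_log_likelihood(probs):
    """Total conditional log-likelihood: sum of ln(p) over the batch."""
    return sum(math.log(p) for p in probs)

ll_a = total_log_likelihood(probs_a)
ll_b = total_log_likelihood(probs_b)

# The parameter set with the higher (less negative) total
# log-likelihood is the better fit for this batch.
better = "Model A" if ll_a > ll_b else "Model B"
```

Because each probability lies in (0, 1), every log term is negative; "better fit" means the total closer to zero.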
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Language Model as a Stochastic Policy
Plackett-Luce Loss Function
A model is being trained by maximizing the sum of log-probabilities for a dataset of 1,000 examples. Consider two scenarios for a single training update:
Scenario A: The probability assigned to the correct output for one example improves from 0.1 to 0.2. The probabilities for all other 999 examples remain unchanged.
Scenario B: The probability assigned to the correct output for one example improves from 0.8 to 0.9. The probabilities for all other 999 examples remain unchanged.
Which scenario leads to a larger increase in the overall training objective function, and why?
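The comparison above can be checked numerically. Since the other 999 examples are unchanged, their terms cancel, and only the updated example's log-probability contributes to the change in the objective:

```python
import math

# Change in the log-likelihood objective from the single updated
# example in each scenario (the other 999 terms cancel).
delta_a = math.log(0.2) - math.log(0.1)  # Scenario A: 0.1 -> 0.2
delta_b = math.log(0.9) - math.log(0.8)  # Scenario B: 0.8 -> 0.9

# ln(0.2/0.1) = ln 2 ~ 0.693, while ln(0.9/0.8) ~ 0.118:
# Scenario A increases the objective more, because the logarithm
# amplifies improvements on low-probability examples and
# compresses gains on examples already near probability 1.
```

This is the key property of the log-likelihood objective: it prioritizes fixing examples the model currently gets badly wrong over polishing examples it already handles well.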
Model Comparison using Conditional Log-Likelihood
Evaluating a Training Update