A model is being trained by maximizing the sum of log-probabilities assigned to the correct outputs over a dataset of 1,000 examples. Consider two scenarios for a single training update:
Scenario A: The probability assigned to the correct output for one example improves from 0.1 to 0.2. The probabilities for all other 999 examples remain unchanged.
Scenario B: The probability assigned to the correct output for one example improves from 0.8 to 0.9. The probabilities for all other 999 examples remain unchanged.
Which scenario leads to a larger increase in the overall training objective function, and why?
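A quick way to check (a minimal sketch; the variable names are my own): since the other 999 terms are unchanged, the change in the objective reduces to log(p_new) − log(p_old) for the single example that moved.

```python
import math

# Change in the log-likelihood objective is log(p_new) - log(p_old)
# for the one example whose probability changed; all other 999 terms cancel.
delta_a = math.log(0.2) - math.log(0.1)  # Scenario A: 0.1 -> 0.2
delta_b = math.log(0.9) - math.log(0.8)  # Scenario B: 0.8 -> 0.9

print(f"Scenario A gain: {delta_a:.4f}")  # ln(2)   ~= 0.6931
print(f"Scenario B gain: {delta_b:.4f}")  # ln(9/8) ~= 0.1178
```

Scenario A increases the objective by ln 2 ≈ 0.693, while Scenario B adds only ln(9/8) ≈ 0.118: the logarithm rewards multiplicative improvements, and moving from 0.1 to 0.2 doubles the probability, whereas 0.8 to 0.9 is only a 1.125× improvement.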
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Language Model as a Stochastic Policy
Plackett-Luce Loss Function
Model Comparison using Conditional Log-Likelihood
Evaluating a Training Update