A machine learning engineer is training a language model on a text corpus. During training, they plot two values at each step:
- The average negative log-likelihood of the target sequences.
- The cross-entropy loss between the model's predicted probability distributions and the one-hot encoded target tokens.
The engineer observes that the two plots are identical. Which of the following statements provides the most accurate mathematical justification for this observation?
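To make the observed identity concrete, here is a minimal numeric sketch (the vocabulary size and probability values are made up for illustration): with a one-hot target, the cross-entropy sum collapses to the single term for the target token, which is exactly the sequence's negative log-likelihood at that step.

```python
import numpy as np

# Hypothetical example: a vocabulary of 4 tokens and the model's predicted
# probability distribution at one time step.
probs = np.array([0.1, 0.6, 0.2, 0.1])
target_index = 1                        # the ground-truth token
one_hot = np.eye(len(probs))[target_index]

# Cross-entropy with a one-hot target: H(p, q) = -sum_i p_i * log(q_i).
# Every term with p_i = 0 vanishes, leaving only the target token's term.
cross_entropy = -np.sum(one_hot * np.log(probs))

# Negative log-likelihood of the target token: -log q(target).
nll = -np.log(probs[target_index])

print(cross_entropy, nll)               # both print ~0.5108
assert np.isclose(cross_entropy, nll)   # the two losses coincide exactly
```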
Tags
Ch.4 Alignment - Foundations of Large Language Models
Related
Equivalence of Training Objectives
True or False: The mathematical equivalence between minimizing cross-entropy loss and maximizing the auto-regressive log-likelihood for a target sequence holds regardless of how the ground-truth labels are represented (e.g., one-hot vectors vs. smoothed probability distributions). (See the sketch after this list.)
Comparing Language Model Training Objectives
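As a companion to the true/false item above, the following sketch (again with made-up probabilities, and a smoothing strength eps chosen arbitrarily) shows why the label representation matters: once the target distribution is smoothed rather than one-hot, the cross-entropy picks up contributions from every vocabulary entry and no longer equals the negative log-likelihood of the target token.

```python
import numpy as np

# Hypothetical example: same predicted distribution as before, but now the
# target is a label-smoothed distribution instead of a one-hot vector.
probs = np.array([0.1, 0.6, 0.2, 0.1])
target_index = 1
eps = 0.1                               # smoothing strength (assumed value)
smoothed = np.full(len(probs), eps / len(probs))
smoothed[target_index] += 1.0 - eps     # e.g. [0.025, 0.925, 0.025, 0.025]

# Cross-entropy against the smoothed target sums over all vocabulary entries.
cross_entropy = -np.sum(smoothed * np.log(probs))

# Negative log-likelihood still looks only at the target token.
nll = -np.log(probs[target_index])

print(cross_entropy, nll)               # ~0.6279 vs ~0.5108: they differ
assert not np.isclose(cross_entropy, nll)
```

The gap between the two numbers illustrates that the equivalence in the main question depends specifically on the one-hot encoding of the targets.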