1Cademy - A neural network is trained on a 4-class classification task. For a single training example where the true class is the second class, the model outputs the probability vector `[0.1, 0.7, 0.1, 0.1]`. The loss for this example is calculated as `-log(0.7)`. This loss function can be interpreted as a measure of divergence between two probability distributions. What are these two distributions?

Learn Before

A Broad Definition of Cross Entropy

Multiple Choice

A neural network is trained on a 4-class classification task. For a single training example where the true class is the second class, the model outputs the probability vector [0.1, 0.7, 0.1, 0.1]. The loss for this example is calculated as -log(0.7). This loss function can be interpreted as a measure of divergence between two probability distributions. What are these two distributions?

Updated 2025-09-26

Contributors are:

Who are from:

Learn Before

Related