1Cademy - Comparing Model Predictions via Loss

Learn Before

Loss Function for Predicted vs. Gold Probability Distributions

Short Answer

Comparing Model Predictions via Loss

A language model is tasked with predicting the next word in the sentence 'The cat sat on the ___'. The correct next word is 'mat'. Two different models, Model A and Model B, produce the following probability distributions for the next word:

Model A: P('mat') = 0.8, P('rug') = 0.1, P('floor') = 0.05, P('chair') = 0.05
Model B: P('mat') = 0.2, P('rug') = 0.3, P('floor') = 0.3, P('chair') = 0.2

Based on the principle of minimizing the difference between the predicted and the true (gold) probability distribution, which model performs better on this specific example? Explain your reasoning.
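One way to check an answer numerically is to compute a loss for each model. The sketch below assumes the loss in question is cross-entropy with a natural logarithm and that the gold distribution is one-hot on 'mat'. The question itself does not name a specific loss, and the helper name `cross_entropy` is illustrative only.

```python
import math

# Assumption: the gold distribution is one-hot, with all mass on 'mat'.
gold = {"mat": 1.0, "rug": 0.0, "floor": 0.0, "chair": 0.0}

# Predicted distributions taken from the question.
model_a = {"mat": 0.8, "rug": 0.1, "floor": 0.05, "chair": 0.05}
model_b = {"mat": 0.2, "rug": 0.3, "floor": 0.3, "chair": 0.2}


def cross_entropy(gold_dist, predicted_dist):
    """Return -sum_w gold(w) * ln(predicted(w)), skipping zero-mass gold terms."""
    return -sum(
        p * math.log(predicted_dist[w])
        for w, p in gold_dist.items()
        if p > 0
    )


loss_a = cross_entropy(gold, model_a)  # reduces to -ln(0.8)
loss_b = cross_entropy(gold, model_b)  # reduces to -ln(0.2)

print(f"Model A loss: {loss_a:.3f}")  # Model A loss: 0.223
print(f"Model B loss: {loss_b:.3f}")  # Model B loss: 1.609
```

Under these assumptions, only the probability assigned to the correct word affects the loss, so Model A's lower value reflects its higher confidence in 'mat'.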

Updated 2025-10-08
