1Cademy - Analyzing Span Prediction Model Loss

Learn Before

Span Prediction Loss Function

Case Study

Analyzing Span Prediction Model Loss

A question-answering model is trained to identify answer spans in a text. For each token in the text, it makes two separate predictions: the probability that the token is the start of the answer and the probability that it is the end. The training objective is to minimize the total loss, which is the sum of the negative log-likelihoods that the two prediction networks assign to the correct start token and the correct end token.
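The loss described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `span_loss`, its parameters, and the four-token probability lists are hypothetical and do not come from any particular model or library.

```python
import math

def span_loss(start_probs, end_probs, start_idx, end_idx):
    """Sum of the negative log-likelihoods of the correct start and end tokens.

    start_probs / end_probs: per-token probabilities from the two prediction
    networks (hypothetical inputs, each assumed to sum to 1).
    start_idx / end_idx: positions of the correct start and end tokens.
    """
    start_nll = -math.log(start_probs[start_idx])
    end_nll = -math.log(end_probs[end_idx])
    return start_nll + end_nll

# Hypothetical 4-token passage whose answer spans tokens 1 through 2.
start_probs = [0.1, 0.7, 0.1, 0.1]
end_probs = [0.1, 0.1, 0.6, 0.2]
loss = span_loss(start_probs, end_probs, start_idx=1, end_idx=2)
print(f"span loss: {loss:.4f}")
```

Only the probabilities assigned to the correct start and end positions enter the loss; the probabilities on the other tokens matter indirectly, because each distribution must sum to 1.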

Consider two different training instances:

Instance 1: The model assigns a very high probability to the correct start token and the correct end token.
Instance 2: The model assigns a very low probability to the correct start token and the correct end token.

Which instance will result in a significantly higher loss value during training, and why is this the case?
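One way to explore the question is to plug illustrative numbers into the loss. The probabilities below are assumptions chosen only to represent "very high" and "very low" confidence; they are not taken from a real model.

```python
import math

def nll(p):
    """Negative log-likelihood of a single probability."""
    return -math.log(p)

# Hypothetical probabilities assigned to the correct start and end tokens.
instance_1 = {"start": 0.95, "end": 0.90}  # very high probabilities
instance_2 = {"start": 0.05, "end": 0.02}  # very low probabilities

# Total loss is the start NLL plus the end NLL.
loss_1 = nll(instance_1["start"]) + nll(instance_1["end"])
loss_2 = nll(instance_2["start"]) + nll(instance_2["end"])

print(f"Instance 1 loss: {loss_1:.3f}")
print(f"Instance 2 loss: {loss_2:.3f}")
```

Because -log(p) approaches 0 as p approaches 1 and grows without bound as p approaches 0, the instance with low probabilities on the correct tokens receives the much larger loss.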

Updated 2025-10-02
