1Cademy - A language model is processing a single training example for a question-answering task. The correct answer span begins at token 25 and ends at token 28. The model predicts the probability of token 25 being the start as 0.6, and the probability of token 28 being the end as 0.7. Using the standard loss calculation for this task, which sums the negative log-likelihoods of the correct start and end positions (`Loss = - (log p_start + log p_end)`), what is the loss value for this example? (Use the na

Learn Before

Span Prediction Loss Formula

Multiple Choice

A language model is processing a single training example for a question-answering task. The correct answer span begins at token 25 and ends at token 28. The model predicts the probability of token 25 being the start as 0.6, and the probability of token 28 being the end as 0.7. Using the standard loss calculation for this task, which sums the negative log-likelihoods of the correct start and end positions (`Loss = - (log p_start + log p_end)`), what is the loss value for this example? (Use the na
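
The arithmetic behind the question can be checked with a short sketch. It assumes the logarithm is the natural logarithm (the instruction in the question text is cut off, so the base is an assumption here), and the helper name `span_loss` is hypothetical, introduced only for illustration.

```python
import math


def span_loss(p_start: float, p_end: float) -> float:
    """Sum of the negative log-likelihoods of the correct start and end positions."""
    # math.log is the natural logarithm; this base is an assumption,
    # since the question's instruction on the base is truncated.
    return -(math.log(p_start) + math.log(p_end))


# Probabilities the model assigns to the correct start (token 25)
# and the correct end (token 28), taken from the question.
loss = span_loss(0.6, 0.7)
print(round(loss, 4))  # 0.8675
```

The token positions 25 and 28 only identify which predicted probabilities enter the formula; they do not appear in the arithmetic. Under the natural-log assumption the result equals `-ln(0.6 * 0.7) = -ln(0.42)`.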

Updated 2025-09-26
