1Cademy - Analyzing the Span Prediction Constraint

Learn Before

Span Prediction Inference Formula

Short Answer

Analyzing the Span Prediction Constraint

A language model has generated start and end log-probabilities for a 5-token sequence. The highest start log-probability is for token 4, and the highest end log-probability is for token 2. If one were to naively select the span based on these two independent maximums, the resulting span would be invalid. Explain why this span is invalid and how the standard span prediction formula, $(\hat{j}_1, \hat{j}_2) = \underset{1 \le j_1 \le j_2 \le n}{\arg\max} \left( \log p_{j_1}^{\text{beg}} + \log p_{j_2}^{\text{end}} \right)$, prevents this specific issue.
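The scenario in the question can be sketched in a few lines of Python. The log-probability values below are hypothetical, chosen only so that the start maximum falls on token 4 and the end maximum on token 2; the function name `best_span` is likewise an illustrative assumption, not part of any library. The sketch compares the naive independent choice with an exhaustive search over pairs that satisfy $j_1 \le j_2$.

```python
import math

def best_span(start_logp, end_logp):
    """Return (j1, j2, score), 1-indexed, maximizing
    start_logp[j1] + end_logp[j2] subject to j1 <= j2."""
    n = len(start_logp)
    best = (None, None, -math.inf)
    for j1 in range(n):
        # Only end positions at or after the start are considered,
        # which is exactly the constraint 1 <= j1 <= j2 <= n.
        for j2 in range(j1, n):
            score = start_logp[j1] + end_logp[j2]
            if score > best[2]:
                best = (j1 + 1, j2 + 1, score)
    return best

# Hypothetical log-probabilities for a 5-token sequence (assumed values).
start_logp = [-3.0, -2.5, -2.0, -0.4, -3.5]  # maximum at token 4
end_logp = [-3.0, -0.3, -2.2, -1.5, -2.8]    # maximum at token 2

# Naive selection: two independent argmaxes, ignoring the ordering constraint.
naive_start = max(range(len(start_logp)), key=lambda i: start_logp[i]) + 1
naive_end = max(range(len(end_logp)), key=lambda i: end_logp[i]) + 1
naive_is_valid = naive_start <= naive_end

# Constrained selection: joint argmax over valid pairs only.
j1, j2, score = best_span(start_logp, end_logp)

print(f"naive span: ({naive_start}, {naive_end}), valid: {naive_is_valid}")
print(f"constrained span: ({j1}, {j2}), score: {score:.1f}")
```

With these assumed numbers the naive pair starts after it ends, so it does not describe a span at all, while the constrained search never evaluates that pair and returns the best-scoring valid one. The double loop costs $O(n^2)$, which is adequate for a sketch; practical implementations often also cap the span length.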

Updated 2025-10-04

Index	Token	Start Log-Prob	End Log-Prob
1	The	-5.1	-8.1
2	first	-4.2	-7.2
3	person	-4.5	-6.5
4	was	-5.5	-5.5
5	Neil	-0.9	-3.1
6	Armstrong	-2.1	-0.5
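
As a worked check, assuming the table above lists start and end log-probabilities for the same span prediction formula with $n = 6$, the two independent maxima are the start score of token 5 ($-0.9$) and the end score of token 6 ($-0.5$). Here they already satisfy $j_1 \le j_2$, so the constrained and unconstrained choices coincide:

$$(\hat{j}_1, \hat{j}_2) = (5, 6), \qquad \log p_{5}^{\text{beg}} + \log p_{6}^{\text{end}} = -0.9 + (-0.5) = -1.4$$

No other valid pair can score higher, because each term is individually the largest in its column. The selected span is "Neil Armstrong".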
