Formula

Span Prediction Inference Formula

During inference for a span prediction task, the optimal answer span, represented by the start index $\hat{j}_1$ and end index $\hat{j}_2$, is found by selecting the pair of indices that maximizes the sum of the log-probabilities for the start and end positions. The search is constrained so that the start index does not come after the end index. The formula is:

$$(\hat{j}_1, \hat{j}_2) = \operatorname*{argmax}_{1 \le j_1 \le j_2 \le n} \big( \log p_{j_1}^{\mathrm{beg}} + \log p_{j_2}^{\mathrm{end}} \big)$$

Where:

- $p_{j_1}^{\mathrm{beg}}$ is the probability that token $j_1$ is the start of the span.
- $p_{j_2}^{\mathrm{end}}$ is the probability that token $j_2$ is the end of the span.
- $n$ is the number of tokens in the context.
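The constrained argmax can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `best_span` and the arguments `p_beg` and `p_end` (plain lists of per-token start and end probabilities) are assumptions introduced here. Rather than scoring all pairs, it keeps the best start position seen so far while scanning candidate end positions, which enforces the start-not-after-end constraint in a single pass.

```python
import math


def _log(p):
    # Treat log(0) as negative infinity instead of raising an error.
    return math.log(p) if p > 0 else -math.inf


def best_span(p_beg, p_end):
    """Return ((j1, j2), score) maximizing log p_beg[j1] + log p_end[j2]
    subject to j1 <= j2. Indices are 1-based to match the formula."""
    if not p_beg or len(p_beg) != len(p_end):
        raise ValueError("p_beg and p_end must be non-empty and equal length")

    best_score = -math.inf
    best = (1, 1)
    best_start = 0  # 0-based index of the most probable start seen so far

    for j2 in range(len(p_end)):
        # Only starts at or before j2 are eligible.
        if p_beg[j2] > p_beg[best_start]:
            best_start = j2
        score = _log(p_beg[best_start]) + _log(p_end[j2])
        if score > best_score:
            best_score = score
            best = (best_start + 1, j2 + 1)

    return best, best_score


# Hypothetical probabilities for a 4-token context.
p_beg = [0.1, 0.2, 0.6, 0.1]
p_end = [0.5, 0.1, 0.1, 0.3]
span, score = best_span(p_beg, p_end)
print(span)  # (3, 4)
```

In this made-up example, picking the start and end independently would give start 3 and end 1, which is not a valid span. The constraint instead selects the span from token 3 to token 4.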

Updated 2026-04-18

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models