Concept

Applying Prediction Networks to Context Token Outputs

In a span prediction architecture, the prediction networks for determining the start and end of an answer span are applied exclusively to the final output embeddings corresponding to the context tokens. These embeddings, which can be represented as a sequence $e_m, e_{m+1}, \ldots, e_{\text{len}-1}$, contain the contextualized information needed to calculate the start ($p_j^{\text{beg}}$) and end ($p_j^{\text{end}}$) probabilities for each position $j$ within the context. The outputs corresponding to the query tokens are not used in this prediction step.
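The slicing step can be illustrated with a minimal sketch in plain Python. It assumes that each prediction network is a single linear scoring vector followed by a softmax over the context positions; that form, and the names `span_probabilities`, `w_beg`, and `w_end`, are hypothetical choices made for illustration and are not specified by the text above.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def span_probabilities(outputs, m, w_beg, w_end):
    """Return (p_beg, p_end) over the context positions m .. len-1.

    outputs: final output embeddings e_0 .. e_{len-1}, one vector per token
    m: index of the first context token (query tokens occupy 0 .. m-1)
    w_beg, w_end: hypothetical weight vectors standing in for the
                  start and end prediction networks (an assumption)
    """
    context = outputs[m:]  # query outputs e_0 .. e_{m-1} are dropped here
    beg_scores = [dot(e, w_beg) for e in context]
    end_scores = [dot(e, w_end) for e in context]
    return softmax(beg_scores), softmax(end_scores)

# Toy input: 2 query tokens followed by 3 context tokens, embedding size 2.
outputs = [[9.0, 9.0], [8.0, 8.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
p_beg, p_end = span_probabilities(outputs, m=2, w_beg=[2.0, 0.0], w_end=[0.0, 2.0])
```

Entry `k` of each returned list corresponds to token position `j = m + k` in the full sequence. Because the slice happens before scoring, the large query embeddings in the toy input have no effect on either distribution.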


Updated 2026-04-18


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models