Learn Before
Applying Prediction Networks to Context Token Outputs
In a span prediction architecture, the prediction networks that determine the start and end of an answer span are applied exclusively to the final output embeddings corresponding to the context tokens. These embeddings, which can be written as the sequence e_m, e_{m+1}, ..., e_{len-1}, contain the contextualized information needed to compute the start probability Pr_start(i) and end probability Pr_end(i) for each position i within the context. The output embeddings corresponding to the query tokens are not used in this prediction step.
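A minimal numpy sketch of this idea, with hypothetical shapes and a single linear layer standing in for each prediction network (the names `m`, `H`, `w_start`, and `w_end` are assumptions for illustration, not from the original text):

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 12, 16  # total tokens (query + context) and embedding size
m = 4                      # index of the first context token; positions 0..m-1 are the query

# Final contextualized embeddings e_0 .. e_{len-1} from the base model (random stand-ins here)
H = rng.normal(size=(seq_len, d_model))

# Prediction "networks" reduced to one linear projection each for this sketch
w_start = rng.normal(size=d_model)
w_end = rng.normal(size=d_model)

# Apply the prediction networks ONLY to the context token outputs e_m .. e_{len-1}
context = H[m:]                    # shape (seq_len - m, d_model); query outputs are discarded
start_logits = context @ w_start
end_logits = context @ w_end

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

p_start = softmax(start_logits)    # Pr_start(i) for each context position i
p_end = softmax(end_logits)        # Pr_end(i) for each context position i

# Query positions receive no probability mass at all
assert p_start.shape == (seq_len - m,)
```

Because the softmax is taken over context positions only, every candidate start or end point lies inside the passage, which is exactly the constraint the span prediction task requires.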
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Related
Span Prediction Loss Function
Inference for Span Prediction
Illustration of BERT-based Architecture for Span Prediction
Input Sequence Formatting for Span Prediction
An engineer is designing a model to extract answers from a paragraph. The model must identify a continuous segment of text (a 'span') that answers a given question. The model's base component processes the input and produces a contextualized vector representation for each token in the paragraph. Considering the task is to identify the start and end points of the answer, which of the following architectural designs for the final prediction layer is most appropriate?
Debugging a Question-Answering Model Architecture
Comparing Model Architectures for Text Extraction Tasks
Learn After
A question-answering model is given a query and a context passage. It processes the combined text and generates a final contextualized embedding for every token. To identify the specific text span within the passage that answers the query, the model must calculate start and end probabilities for each potential token. Which set of embeddings should be used as input to the prediction networks that perform this calculation?
Debugging a Span Prediction Model
In a span prediction model designed for question answering, after the entire input (query + context) has been processed to generate contextualized token embeddings, the prediction networks for the answer's start and end positions must evaluate the embeddings for all tokens in the original input sequence.