Learn Before
Applying Prediction Networks to Context Token Outputs
In a span prediction architecture, the prediction networks that determine the start and end of an answer span are applied exclusively to the final output embeddings corresponding to the context tokens. These embeddings, which can be written as the sequence e_m, e_{m+1}, ..., e_{len-1}, contain the contextualized information needed to compute the start probability Pr_start(i) and end probability Pr_end(i) for each position i within the context. The output embeddings corresponding to the query tokens are not used in this prediction step.
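A minimal numpy sketch of this idea, with hypothetical shapes and a single linear layer standing in for each prediction network (the names `m`, `H`, `w_start`, and `w_end` are assumptions for illustration, not from the original text):

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 12, 16  # total tokens (query + context) and embedding size
m = 4                      # index of the first context token; positions 0..m-1 are the query

# Final contextualized embeddings e_0 .. e_{len-1} from the base model (random stand-ins here)
H = rng.normal(size=(seq_len, d_model))

# Prediction "networks" reduced to one linear projection each for this sketch
w_start = rng.normal(size=d_model)
w_end = rng.normal(size=d_model)

# Apply the prediction networks ONLY to the context token outputs e_m .. e_{len-1}
context = H[m:]                    # shape (seq_len - m, d_model); query outputs are discarded
start_logits = context @ w_start
end_logits = context @ w_end

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

p_start = softmax(start_logits)    # Pr_start(i) for each context position i
p_end = softmax(end_logits)        # Pr_end(i) for each context position i

# Query positions receive no probability mass at all
assert p_start.shape == (seq_len - m,)
```

Because the softmax is taken over context positions only, every candidate start or end point lies inside the passage, which is exactly the constraint the span prediction task requires.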
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Related
Span Prediction Loss Function
Inference for Span Prediction
Illustration of BERT-based Architecture for Span Prediction
Input Sequence Formatting for Span Prediction
An engineer is designing a model to extract answers from a paragraph. The model must identify a continuous segment of text (a 'span') that answers a given question. The model's base component processes the input and produces a contextualized vector representation for each token in the paragraph. Considering the task is to identify the start and end points of the answer, which of the following architectural designs for the final prediction layer is most appropriate?
Debugging a Question-Answering Model Architecture
Comparing Model Architectures for Text Extraction Tasks
Learn After
A question-answering model is given a query and a context passage. It processes the combined text and generates a final contextualized embedding for every token. To identify the specific text span within the passage that answers the query, the model must calculate start and end probabilities for each potential token. Which set of embeddings should be used as input to the prediction networks that perform this calculation?
Debugging a Span Prediction Model
In a span prediction model designed for question answering, after the entire input (query + context) has been processed to generate contextualized token embeddings, the prediction networks for the answer's start and end positions must evaluate the embeddings for all tokens in the original input sequence.