Learn Before
Inference for Span Prediction
During inference (also known as test time), after the model has been trained, a search is conducted to identify the optimal answer span. The search uses the start and end probabilities the model assigns to each token in the context to find the span with the highest combined score.
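A minimal sketch of this search, assuming the start and end probabilities are already available as plain lists (the function name and optional length cap are illustrative, not from the source):

```python
def best_span(start_probs, end_probs, max_len=None):
    """Return (start, end, score) for the highest-scoring valid span.

    A span (s, e) is valid when s <= e; its score is the product
    start_probs[s] * end_probs[e]. An optional max_len caps span length.
    """
    best = (0, 0, -1.0)
    for s, p_start in enumerate(start_probs):
        for e in range(s, len(end_probs)):
            # Skip spans longer than the (hypothetical) length limit.
            if max_len is not None and e - s + 1 > max_len:
                break
            score = p_start * end_probs[e]
            if score > best[2]:
                best = (s, e, score)
    return best
```

The nested loop makes the start-before-end constraint explicit: the inner loop only ever visits end positions at or after the current start position.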
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Span Prediction Loss Function
Inference for Span Prediction
Illustration of BERT-based Architecture for Span Prediction
Input Sequence Formatting for Span Prediction
Applying Prediction Networks to Context Token Outputs
An engineer is designing a model to extract answers from a paragraph. The model must identify a continuous segment of text (a 'span') that answers a given question. The model's base component processes the input and produces a contextualized vector representation for each token in the paragraph. Considering the task is to identify the start and end points of the answer, which of the following architectural designs for the final prediction layer is most appropriate?
Debugging a Question-Answering Model Architecture
Comparing Model Architectures for Text Extraction Tasks
Learn After
Span Prediction Inference Formula
Identifying the Optimal Answer Span
A language model has processed the context 'The capital of France is Paris.' and produced the following probabilities for each token being the start or the end of an answer span. To determine the most likely answer, you must find the start and end token pair that yields the highest combined score (calculated as start_probability * end_probability), with the constraint that the start token cannot appear after the end token. Given the table below, which span is the most likely answer?
Token      Index  Start Probability  End Probability
'The'      1      0.05               0.05
'capital'  2      0.10               0.05
'of'       3      0.05               0.05
'France'   4      0.20               0.10
'is'       5      0.05               0.05
'Paris'    6      0.50               0.60
'.'        7      0.05               0.10

Flaw in a Naive Inference Strategy
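The question above can be checked directly by scoring every valid (start, end) pair with the probabilities from the table; this is a small worked sketch, with the values copied from the question:

```python
# Probabilities copied from the table in the question above.
tokens  = ['The', 'capital', 'of', 'France', 'is', 'Paris', '.']
start_p = [0.05, 0.10, 0.05, 0.20, 0.05, 0.50, 0.05]
end_p   = [0.05, 0.05, 0.05, 0.10, 0.05, 0.60, 0.10]

# Score every pair with start <= end and keep the highest product.
s, e, score = max(
    ((s, e, start_p[s] * end_p[e])
     for s in range(len(tokens))
     for e in range(s, len(tokens))),
    key=lambda triple: triple[2],
)
print(tokens[s:e + 1], round(score, 2))  # ['Paris'] 0.3
```

The single-token span 'Paris' wins with 0.50 * 0.60 = 0.30; the closest competitor, 'France ... Paris', scores only 0.20 * 0.60 = 0.12.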