Learn Before
Span Prediction in NLP
Span prediction is an NLP task focused on identifying a contiguous segment of text. A prominent example is reading comprehension, where a model is given a query and a context passage and must locate the span within the context that answers the query. This task can be framed as a form of sequence labeling, where the objective is to predict, for each token in the context, whether it marks the beginning or the end of the desired span.
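The start/end formulation above can be illustrated with a minimal sketch. It assumes a model has already produced one "start" score and one "end" score per context token. The function name best_span, the max_len parameter, and the example scores are hypothetical and chosen only for illustration. The sketch picks the pair of positions with the highest combined score, subject to the start not coming after the end, which is one common way to turn per-token predictions into a single span.

```python
from typing import List, Optional, Tuple


def best_span(
    start_scores: List[float],
    end_scores: List[float],
    max_len: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (start, end) token indices, inclusive, maximizing
    start_scores[start] + end_scores[end] with start <= end.

    max_len (hypothetical parameter) optionally caps the span length in tokens.
    Ties are resolved in favor of the earliest span found.
    """
    if not start_scores or len(start_scores) != len(end_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    n = len(start_scores)
    best = (0, 0)
    best_score = float("-inf")
    for i in range(n):
        # Only consider end positions at or after the start position.
        last = n if max_len is None else min(n, i + max_len)
        for j in range(i, last):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Illustrative scores only; a real model would compute these from the context.
tokens = ["The", "IOC", "was", "founded", "in", "1894", "."]
start = [0.1, 0.2, 0.0, 0.3, 0.5, 2.0, 0.0]
end = [0.0, 0.1, 0.0, 0.2, 0.1, 2.5, 0.3]

i, j = best_span(start, end)
answer = tokens[i : j + 1]
print(answer)  # ['1894']
```

Scoring start and end positions jointly, rather than taking the highest-scoring start and the highest-scoring end independently, guarantees that the result is a well-formed span. The optional length cap is one simple way to avoid spans that stretch across unrelated text.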
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.1 Pre-training - Foundations of Large Language Models
Related
Part-of-Speech (POS) Tagging
BERT-based Architecture for Sequence Labeling
Span Prediction in NLP
Definition of Named Entity Recognition
A model is designed to perform a sequence labeling task by identifying organizations and locations within a text. For each word (token), it must assign one of the following labels: O (not an entity), B-ORG (beginning of an organization), I-ORG (inside an organization), B-LOC (beginning of a location), or I-LOC (inside a location). Given the sentence 'The United Nations headquarters in New York City is a major landmark', which of the following represents the correct sequence of labels?
Applicability of Sequence Labeling
Analyzing a Sequence Labeling Model's Output
Negative Likelihood Loss in Sequence Labeling
Learn After
A language model is tasked with answering a question by identifying the correct text span from a given context. The model works by calculating a probability for each token being the 'start' of the answer and a separate probability for each token being the 'end' of the answer. Consider the following scenario:
Context: 'The first modern Olympic Games were held in Athens, Greece, in 1896. The International Olympic Committee (IOC) was founded in 1894 by Pierre de Coubertin.' Question: 'When was the IOC established?'
The model produces the following highest probabilities:
- Highest Probability Start Token: '1896' (Probability: 0.85)
- Highest Probability End Token: '1894' (Probability: 0.91)
Based on this output, what is the most fundamental reason the model failed to produce a valid answer span?
Framing a Clinical Information Extraction Task
Applicability of Span Prediction
BERT-based Architecture for Span Prediction