Designing a Span Prediction Module
Based on the provided scenario, describe the architecture of the components the engineer should add on top of the pre-trained model to pinpoint the answer span. What would each component do, and what specific information would it use as input?
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A researcher is fine-tuning a model for a question-answering task. The model processes a question and a context paragraph to predict the start and end positions of the answer within the paragraph. After training, the researcher observes a specific performance issue: the model consistently identifies the correct end token of the answer span, but frequently selects an incorrect start token. Based on the typical architecture for this task where separate predictions are made for the start and end points, which component is the most likely source of this specific error pattern?
A model is designed to extract a specific span of text (the answer) from a larger context paragraph based on a given question. Arrange the following steps in the correct logical order that describes how this model processes the information to identify the answer.
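The related questions above assume the common extractive setup in which the start and end of the answer are scored separately for every token. The following is a minimal sketch of that idea in plain Python, not a definitive implementation: the function names, the toy hidden states, the weight vectors, and the `max_len` cutoff are all illustrative assumptions, and a real system would take the hidden states from the pre-trained encoder and learn the weights during fine-tuning.

```python
import math
from typing import List, Sequence, Tuple


def linear_scores(hidden: Sequence[Sequence[float]],
                  weights: Sequence[float],
                  bias: float = 0.0) -> List[float]:
    """Return one logit per token: dot(hidden_state, weights) + bias."""
    return [sum(h * w for h, w in zip(vec, weights)) + bias for vec in hidden]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn per-token logits into a probability distribution over positions."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_span(start_logits: Sequence[float],
              end_logits: Sequence[float],
              context_start: int,
              max_len: int = 30) -> Tuple[int, int]:
    """Pick (start, end) maximizing start_logit + end_logit.

    Only context tokens are considered, and the end may not precede the start.
    """
    best = (context_start, context_start)
    best_score = -math.inf
    n = len(start_logits)
    for i in range(context_start, n):
        for j in range(i, min(i + max_len, n)):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Toy example (hypothetical values): 2 question tokens followed by 4 context
# tokens, each represented by a 2-dimensional hidden state.
hidden_states = [
    [0.1, 0.0], [0.0, 0.1],   # question tokens (positions 0-1)
    [0.2, 0.1], [2.0, 0.3],   # context tokens (positions 2-3)
    [0.5, 1.8], [0.1, 0.2],   # context tokens (positions 4-5)
]
w_start = [1.0, 0.0]  # assumed parameters of the start scorer
w_end = [0.0, 1.0]    # assumed parameters of the end scorer

start_logits = linear_scores(hidden_states, w_start)
end_logits = linear_scores(hidden_states, w_end)
start_probs = softmax(start_logits)
end_probs = softmax(end_logits)
span = best_span(start_logits, end_logits, context_start=2)
print(span)  # (3, 4)
```

Because the start and end scorers in this sketch have separate parameters but read the same hidden states, an error that affects only one boundary can be traced to the corresponding scorer rather than to the shared encoder.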