Learn Before
Comparing Model Architectures for Text Extraction Tasks
Imagine you are tasked with building two different systems. System A must identify a contiguous span of text (e.g., a person's name, a date) within a sentence that answers a specific question. System B must classify every word in a sentence into one of several predefined categories (e.g., Person, Location, Organization, or Other). Both systems will use the same foundational language model, which produces a contextualized vector for each input token.
Compare and contrast the design of the final prediction layers you would build on top of the foundational model for System A versus System B. Explain why the different task requirements necessitate these different architectural choices.
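As a starting point for your answer, the two prediction layers can be sketched with plain numpy. This is a minimal illustration with randomly initialized (untrained) weights, not a reference implementation: System A scores every token as a potential start or end of the answer span, while System B maps each token vector independently to a distribution over the label set. All array sizes and weight names below are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden = 6, 8                       # toy sizes: 6 tokens, 8-dim vectors
H = rng.standard_normal((seq_len, hidden))   # foundation-model outputs, one row per token

# System A: span extraction -- two linear heads produce a start score and an
# end score for every token; the predicted span runs from the argmax start
# position to the argmax end position.
w_start = rng.standard_normal(hidden)
w_end = rng.standard_normal(hidden)
start_logits = H @ w_start                   # shape (seq_len,)
end_logits = H @ w_end                       # shape (seq_len,)
span = (int(np.argmax(start_logits)), int(np.argmax(end_logits)))

# System B: token classification -- a single linear head maps each token
# vector to logits over the label set, and each token is labeled independently.
labels = ["Person", "Location", "Organization", "Other"]
W_cls = rng.standard_normal((hidden, len(labels)))
token_logits = H @ W_cls                     # shape (seq_len, num_labels)
token_preds = [labels[i] for i in np.argmax(token_logits, axis=1)]

print(span, token_preds)
```

The contrast the question is after shows up directly in the output shapes: System A emits exactly two distributions over positions regardless of the label inventory, whereas System B emits one distribution over categories per token.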
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Span Prediction Loss Function
Inference for Span Prediction
Illustration of BERT-based Architecture for Span Prediction
Input Sequence Formatting for Span Prediction
Applying Prediction Networks to Context Token Outputs
An engineer is designing a model to extract answers from a paragraph. The model must identify a contiguous segment of text (a 'span') that answers a given question. The model's base component processes the input and produces a contextualized vector representation for each token in the paragraph. Given that the task is to identify the start and end points of the answer, which of the following architectural designs for the final prediction layer is most appropriate?
Debugging a Question-Answering Model Architecture