Model Adaptation for Similarity Scoring
A team is adapting a pre-trained transformer model, originally built for text classification, to calculate a similarity score between pairs of sentences. The goal is to have the model output a single value between 0 (completely different) and 1 (identical). After training, they observe that the model's output is a single, unbounded number (e.g., -12.7, 0.5, 23.9), which is not the desired format. What specific component is most likely missing from the final layer of their model architecture, and why is this component necessary to achieve the desired output?
0
1
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Model Adaptation for Similarity Scoring
A developer is building a system to identify duplicate questions on a Q&A website. The system needs to take two questions as input and output a single similarity score ranging from 0.0 (completely different) to 1.0 (identical). The developer plans to use a pre-trained transformer-based model. Which modification to the model's final output layer is the most suitable for this regression task?
Designing a Semantic Search Component