Short Answer

Model Adaptation for Similarity Scoring

A team is adapting a pre-trained transformer model, originally built for text classification, to calculate a similarity score between pairs of sentences. The goal is to have the model output a single value between 0 (completely different) and 1 (identical). After training, they observe that the model's output is a single, unbounded number (e.g., -12.7, 0.5, 23.9), which is not the desired format. What specific component is most likely missing from the final layer of their model architecture, and why is this component necessary to achieve the desired output?

0

1

Updated 2025-09-26

Contributors are:

Who are from:

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science