Case Study

Debugging Sentence Pair Representations

A language model is designed to process sentence pairs. The input representation for each token is generated by summing three distinct vectors: one for the token's identity, one for its position within its sentence, and one indicating whether it belongs to the first or second sentence.

The model is given the following input:

Sentence 1: 'The team won the game.'

Sentence 2: 'The team lost the game.'

Upon inspection, an engineer discovers that the final input vector for the token 'team' in Sentence 1 is identical to the final input vector for the token 'team' in Sentence 2. Given this information, which of the three component vectors is most likely being implemented incorrectly, and why is this a problem?
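The three-way sum described in the scenario can be sketched in a few lines, which may help when reasoning about the engineer's inspection. This is a minimal illustration and not the model's actual implementation. The names `DIM`, `make_table`, `embed`, and the segment labels "A" and "B" are hypothetical, positions are assumed to be zero-based and to restart in each sentence, and random vectors stand in for learned embeddings.

```python
import random

DIM = 4  # hypothetical embedding width, chosen only for illustration


def make_table(keys, dim, rng):
    """Build a lookup table with one random vector per key (stand-in for learned weights)."""
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in keys}


def embed(token, position, segment, tok_table, pos_table, seg_table):
    """Sum the token, position, and segment vectors element-wise."""
    parts = (tok_table[token], pos_table[position], seg_table[segment])
    return [sum(vals) for vals in zip(*parts)]


rng = random.Random(0)
sent1 = ["The", "team", "won", "the", "game", "."]
sent2 = ["The", "team", "lost", "the", "game", "."]

tok_table = make_table(sorted(set(sent1 + sent2)), DIM, rng)
pos_table = make_table(range(len(sent1)), DIM, rng)  # assumption: positions restart per sentence
seg_table = make_table(["A", "B"], DIM, rng)         # assumption: "A" = first sentence, "B" = second

# 'team' sits at index 1 in both sentences.
team_1 = embed("team", 1, "A", tok_table, pos_table, seg_table)
team_2 = embed("team", 1, "B", tok_table, pos_table, seg_table)

# The engineer's inspection: compare the two final input vectors.
print(team_1 == team_2)
```

Repeating the comparison on each of the three lookups separately, rather than on the final sum, is one way to narrow down which component is responsible when the sums unexpectedly match.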

Updated 2025-10-08

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science