Debugging Sentence Pair Representations
A language model is designed to process sentence pairs. The input representation for each token is generated by summing three distinct vectors: one for the token's identity, one for its position within its sentence, and one indicating whether it belongs to the first or second sentence. The model is given the following input pair. Sentence 1: 'The team won the game.' Sentence 2: 'The team lost the game.' Upon inspection, an engineer discovers that the final input vector for the token 'team' in Sentence 1 is identical to the final input vector for the token 'team' in Sentence 2. Given this information, which of the three component vectors is most likely implemented incorrectly, and why is this a problem?
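The setup described above can be explored with a small sketch. It is only an illustration under stated assumptions: the embedding width `DIM`, the random stand-in tables, and the helper names `make_table`, `input_vector`, and `encode_pair` are all hypothetical and not taken from any particular model. Positions restart at 0 in each sentence, as the question specifies, and the `skip` argument drops one component so its effect on the two 'team' tokens can be checked.

```python
import random

DIM = 4  # hypothetical embedding width, chosen only for illustration
rng = random.Random(0)

def make_table(keys):
    """Map each key to a random vector; a stand-in for a learned embedding table."""
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(DIM)] for k in keys}

sentence_1 = ["The", "team", "won", "the", "game", "."]
sentence_2 = ["The", "team", "lost", "the", "game", "."]

token_emb = make_table(sorted(set(sentence_1) | set(sentence_2)))
position_emb = make_table(range(max(len(sentence_1), len(sentence_2))))
segment_emb = make_table([0, 1])

def input_vector(token, position, segment, skip=None):
    """Element-wise sum of the three component vectors, optionally omitting one."""
    parts = {
        "token": token_emb[token],
        "position": position_emb[position],
        "segment": segment_emb[segment],
    }
    kept = [vec for name, vec in parts.items() if name != skip]
    return [sum(values) for values in zip(*kept)]

def encode_pair(first, second, skip=None):
    """Return {(segment, position): vector}; positions restart in each sentence."""
    encoded = {}
    for segment, sentence in enumerate((first, second)):
        for position, token in enumerate(sentence):
            encoded[(segment, position)] = input_vector(token, position, segment, skip)
    return encoded

# 'team' sits at position 1 in both sentences. Drop each component in turn
# and report whether the two 'team' vectors become identical.
for component in ("token", "position", "segment"):
    vectors = encode_pair(sentence_1, sentence_2, skip=component)
    print(component, vectors[(0, 1)] == vectors[(1, 1)])
```

Running the loop shows which omitted component reproduces the engineer's observation; comparing that result with the full sum from `encode_pair(sentence_1, sentence_2)` is one way to reason about the question.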
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A researcher is debugging a language model where the input representation for each token is created by summing three distinct vectors: one for the token's identity, one for its position in the sequence, and one for the sentence segment it belongs to. The researcher observes that the model treats the sentences 'The scientist observed the star' and 'The star observed the scientist' as having identical meanings. Which of the three component vectors is most likely being calculated incorrectly or omitted, causing this specific error?
In a language model that uses separate vectors for token identity, position, and sentence membership, the final input vector for a token is created by concatenating these three component vectors end-to-end.
Debugging Sentence Pair Representations
Segment Embedding
Example of Input Embedding Composition for a Sentence Pair