Invariance in Rotational Position Encodings
A language model uses a rotational scheme to encode token positions. An analyst observes that the inner product between the encoded representations of 'apple' at position 4 and 'banana' at position 9 is identical to that between 'apple' at position 21 and 'banana' at position 26. Explain the mathematical reason for this observation.
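For reference, a minimal sketch of the identity behind this invariance, assuming a RoPE-style scheme in which position m applies a block-diagonal rotation R_m to the embedding, and rotations at the same frequencies compose additively (R_m^T R_n = R_{n-m}):

    \langle R_m x,\; R_n y \rangle
      = x^\top R_m^\top R_n\, y
      = x^\top R_{n-m}\, y

The inner product therefore depends only on the relative offset n - m, and both pairs in the question share the same offset: 9 - 4 = 26 - 21 = 5.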
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider two token embeddings, x and y, encoded with a rotational position scheme at positions 10 and 7, respectively, and suppose their inner product is computed. If the same two tokens are instead placed at positions 25 and 22, how would the new inner product compare to the original? (A numeric sketch follows this list.)
Evaluating a Positional Encoding Implementation
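A minimal numeric sketch of the related scenario above, assuming a standard RoPE-style pairwise rotation. The rope helper and the frequency schedule base**(-2i/d) are illustrative assumptions, not any particular library's API:

    import numpy as np

    def rope(x, pos, base=10000.0):
        # Rotate each consecutive pair of dimensions of x by pos * theta_i (RoPE-style).
        d = x.shape[-1]
        theta = base ** (-np.arange(0, d, 2) / d)  # one frequency per 2-D pair
        ang = pos * theta
        cos, sin = np.cos(ang), np.sin(ang)
        out = np.empty_like(x)
        out[0::2] = x[0::2] * cos - x[1::2] * sin
        out[1::2] = x[0::2] * sin + x[1::2] * cos
        return out

    rng = np.random.default_rng(0)
    x, y = rng.normal(size=8), rng.normal(size=8)
    a = rope(x, 10) @ rope(y, 7)    # relative offset 3
    b = rope(x, 25) @ rope(y, 22)   # relative offset 3
    print(np.allclose(a, b))        # True: inner product unchanged

The two inner products match because only the relative offset enters the result, and 10 - 7 = 25 - 22 = 3.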