Implicit Relative Position Modeling in Self-Attention with RoPE
When Rotary Positional Embeddings (RoPE) are applied to query and key vectors, the self-attention mechanism inherently captures relative positional context. Specifically, if the RoPE-encoded vectors Ro(x, tθ) and Ro(y, sθ) are treated as the query and key respectively, the self-attention operation implicitly models relative positions: their dot product satisfies Ro(x, tθ) · Ro(y, sθ) = x · Ro(y, (s − t)θ), so the attention score depends on the absolute positions t and s only through the relative displacement t − s.
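As a numeric check on this property, here is a minimal NumPy sketch (illustrative only; the pairing of dimensions and the base value 10000 follow the common RoPE convention and are assumptions, not taken from these notes). It rotates a toy query and key to positions 5 and 8, then to 15 and 18, and confirms the two scores match because the offset t − s is the same in both cases.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Apply a RoPE-style rotation to vector x for absolute position `pos`.

    Consecutive dimension pairs (x[2i], x[2i+1]) are rotated by the angle
    pos * theta_i, where theta_i = base**(-2i/d).
    """
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE pairs up dimensions, so d must be even"
    theta = base ** (-np.arange(0, d, 2) / d)      # one angle per 2-D pair
    ang = pos * theta
    cos, sin = np.cos(ang), np.sin(ang)
    x_even, x_odd = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x_even * cos - x_odd * sin         # 2-D rotation of each pair
    out[1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q, k = rng.normal(size=8), rng.normal(size=8)      # toy query / key vectors

# Same relative displacement t - s = -3 in both cases.
score_a = rope_rotate(q, 5) @ rope_rotate(k, 8)
score_b = rope_rotate(q, 15) @ rope_rotate(k, 18)
print(np.isclose(score_a, score_b))                # True: score depends only on t - s
```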

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Derivation of the Dot Product for RoPE-Encoded Vectors
Implicit Relative Position Modeling in Self-Attention with RoPE
A language model uses a positional encoding scheme with a specific mathematical property: the dot product between the encoded representations of any two tokens is a function solely of the difference between their positions in the sequence. Which of the following statements most accurately analyzes the primary advantage of this property for processing language?
In a system that encodes token positions by rotating their vector representations, the dot product between the encoded vector for a token at position t and another at position s is found to be dependent only on their relative displacement (t - s). Based on this property, the dot product calculated for a pair of tokens at positions 5 and 8 would be identical to the dot product for the same pair of tokens if they were located at positions 15 and 18.
Diagnosing a Positional Encoding Flaw
Query (Attention)
Key (Attention)
Value (Attention)
State Function from Previous Outputs
Value Weight Matrix Formula
Set of Sequential Key-Value Pairs
Query Vector
Key Vector
Value Vector
Implicit Relative Position Modeling in Self-Attention with RoPE
Value Weight Matrix Definition
Imagine a system translating the sentence 'The quick brown fox jumps'. When the system is generating the output word corresponding to 'jumps', it needs to determine which words in the input sentence are most relevant. To do this, a vector representing the current translation context (i.e., 'what information do I need to produce the next word?') is compared against a set of searchable 'label' vectors, one for each word in the input sentence. This comparison generates a relevance score for each input word. Finally, a new vector is created by taking a weighted average of the 'content' vectors of the input words, using the relevance scores as weights. How do the three main vector types in this process correspond to their roles?
In a system designed to answer questions based on a provided document, the model first creates a representation of the user's question. It then compares this representation against a set of searchable representations, one for each sentence in the document, to determine relevance scores. Finally, it constructs an answer by creating a weighted combination of the informational content from each sentence, using the relevance scores as weights. Which option correctly assigns the roles of Query, Key, and Value vectors in this scenario?
Context Window of Key Vectors Notation
Key-Value Cache
In a computational mechanism designed to selectively focus on different parts of an input sequence, information is represented by three distinct types of vectors that interact to produce a context-aware output. Match each vector type to its specific role in this process.
Masked QKV Attention Formula
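To tie the Query/Key/Value scenarios and the masked-attention card above together, here is a minimal NumPy sketch (illustrative only; the toy shapes and function name are assumptions, not taken from these notes). Each query is scored against every key, scores at future positions are masked out, and the softmax-weighted average of the value vectors forms the output.

```python
import numpy as np

def masked_attention(Q, K, V):
    """Causal scaled dot-product attention.

    Q, K, V: arrays of shape (seq_len, d). Each query is compared against
    every key to get relevance scores, future positions are masked to -inf,
    softmax turns scores into weights, and the output mixes the value vectors.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                        # relevance of each key to each query
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)             # hide positions after the query
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ V                                   # weighted average of value vectors

# Toy example: 4 tokens, 6-dimensional vectors.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 6)) for _ in range(3))
print(masked_attention(Q, K, V).shape)                   # (4, 6)
```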
Learn After
An attention mechanism incorporates positional information by applying a unique rotation to each query and key vector based on its absolute position in a sequence. The attention score between a query from position 't' and a key from position 's' is then computed. A key property of this rotation is that the dot product between the rotated query and key vectors is a function of the original vectors and the difference in their positions (t-s). Based on this information, what can be concluded about the attention scores produced by this mechanism?
Analysis of a Positional Encoding Method
Evaluating a Model's Performance Discrepancy