Concept

Implicit Relative Position Modeling in Self-Attention with RoPE

When Rotary Positional Embeddings (RoPE) are applied to query and key vectors, the self-attention mechanism inherently captures relative positional context. Specifically, if the RoPE-encoded vectors Ro(x, tθ) and Ro(y, sθ) are treated as the query and key respectively, the self-attention operation implicitly models relative positions. This is because their dot product is a function only of the relative displacement t − s: since Ro(·) applies a rotation whose angle is proportional to the position, Ro(x, tθ) · Ro(y, sθ) = Ro(x, (t − s)θ) · y, so shifting both positions by the same offset leaves the attention score unchanged.
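To make this concrete, here is a minimal NumPy sketch of the idea (the function name rope, the base 10000, and the dimension and position values are illustrative assumptions, not from the source). It rotates consecutive coordinate pairs of a vector by position-dependent angles and then verifies that the query-key dot product is identical for two position pairs with the same displacement t − s.

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Illustrative RoPE: rotate consecutive pairs of x by angles pos * theta_i."""
    d = x.shape[-1]
    theta = base ** (-np.arange(0, d, 2) / d)  # one frequency per coordinate pair
    angles = pos * theta
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                  # even/odd coordinates form the pairs
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin            # 2-D rotation applied to each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

rng = np.random.default_rng(0)
d = 64
q, k = rng.normal(size=d), rng.normal(size=d)

# The rotated dot product depends only on the displacement t - s:
score_a = rope(q, 10) @ rope(k, 3)      # t = 10,  s = 3,   t - s = 7
score_b = rope(q, 107) @ rope(k, 100)   # t = 107, s = 100, t - s = 7
print(np.allclose(score_a, score_b))    # True
```

Running this prints True: the attention score between the two tokens is unchanged when both positions are shifted together, which is exactly the implicit relative-position property described above.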
