Learn Before
Positional Invariance in Self-Attention
Consider two identical phrases, 'the quick brown fox', appearing at the beginning of one document and in the middle of another. A self-attention mechanism processes the relationship between the words 'quick' and 'fox' in both instances. Explain why a model using relative positional representations would compute a more consistent attention score for this word pair across the two documents than a model using absolute positional representations based on each token's fixed index from the start of the sequence.
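The contrast in the question can be made concrete with a toy numerical sketch. The snippet below (all embeddings, projection matrices, and the dictionary-based relative-offset table are hypothetical illustrations, not any particular model's parameters) computes an attention logit for the pair ('quick', 'fox') two ways: once with sinusoidal absolute encodings added to the token embeddings, and once with a Shaw-et-al.-style relative scheme where an offset embedding is added to the key. Shifting the phrase 500 positions changes the absolute-encoding score but leaves the relative score untouched, because the latter depends only on the offset between the two tokens.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy model dimension

# Hypothetical token embeddings for 'quick' and 'fox'.
quick, fox = rng.normal(size=d), rng.normal(size=d)
Wq = rng.normal(size=(d, d)) / np.sqrt(d)  # query projection
Wk = rng.normal(size=(d, d)) / np.sqrt(d)  # key projection

def sinusoidal(pos, d):
    """Standard sinusoidal absolute positional encoding for one position."""
    i = np.arange(d // 2)
    angles = pos / 10000 ** (2 * i / d)
    return np.concatenate([np.sin(angles), np.cos(angles)])

def abs_score(p_quick, p_fox):
    # Absolute scheme: position vectors are added to embeddings,
    # so the score depends on the raw indices p_quick and p_fox.
    q = (quick + sinusoidal(p_quick, d)) @ Wq
    k = (fox + sinusoidal(p_fox, d)) @ Wk
    return q @ k / np.sqrt(d)

# Relative scheme: one learned vector per offset (hypothetical table).
rel = {o: rng.normal(size=d) for o in range(-4, 5)}

def rel_score(p_quick, p_fox):
    # Only the offset p_fox - p_quick enters the computation.
    q = quick @ Wq
    k = fox @ Wk + rel[p_fox - p_quick]
    return q @ k / np.sqrt(d)

# Same word pair at the start of one document and 500 tokens into another.
print(abs_score(1, 3), abs_score(501, 503))  # scores differ
print(rel_score(1, 3), rel_score(501, 503))  # scores identical (offset is +2 both times)
```

Running this shows the relative scores are bit-for-bit identical across the two placements, while the absolute-encoding scores drift apart, which is the consistency the question asks you to explain.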
Tags
Data Science
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is trained exclusively on text sequences with a maximum length of 512 tokens. During evaluation, the model shows a significant drop in performance when processing documents that are 1000 tokens long. The engineers hypothesize the problem is related to how the model incorporates word order information. Which of the following changes to the model's architecture is most likely to resolve this specific issue?
Positional Encoding for Machine Translation
Positional Invariance in Self-Attention
Mechanism of Relative Positional Embedding