Shared Learnable Bias per Offset
A basic design for relative positional bias assigns a single shared learnable parameter to each distinct query-key distance. Under this scheme, the bias applied to a query at position i and a key at position j depends only on their offset, i - j. Consequently, all pairs (i, j) sharing the same offset are mapped to the same learnable scalar, b_{i-j}, yielding the relationship bias(i, j) = b_{i-j}; the biased attention score is then q_i · k_j + b_{i-j}.
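To make the scheme concrete, here is a minimal PyTorch-style sketch of a bias table with one learnable scalar per offset, added directly to the query-key attention logits. The module name SharedOffsetBias, the clamping of offsets to a max_distance, and the tensor shapes are illustrative assumptions, not details from the text above.

import torch
import torch.nn as nn

class SharedOffsetBias(nn.Module):
    """One learnable scalar per relative offset i - j (illustrative sketch).

    Offsets are clamped to [-max_distance, max_distance] so the table is
    finite; the clamping choice is an assumption for illustration.
    """

    def __init__(self, max_distance: int = 128):
        super().__init__()
        self.max_distance = max_distance
        # One parameter for each offset in [-max_distance, max_distance].
        self.bias = nn.Parameter(torch.zeros(2 * max_distance + 1))

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # scores: (..., query_len, key_len) raw q·k attention logits.
        q_len, k_len = scores.shape[-2], scores.shape[-1]
        i = torch.arange(q_len).unsqueeze(1)  # query positions, shape (q_len, 1)
        j = torch.arange(k_len).unsqueeze(0)  # key positions,   shape (1, k_len)
        offset = (i - j).clamp(-self.max_distance, self.max_distance)
        # Every (i, j) pair with the same offset reads the same scalar b_{i-j}.
        return scores + self.bias[offset + self.max_distance]

Under this sketch, the pairs (5, 2) and (9, 6) both have offset 3 and therefore read the same entry of the table, which is exactly the weight sharing described above.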
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Interpretation of Positional Bias as a Distance Penalty
T5 Bias for Relative Positional Embedding
Shared Learnable Bias per Offset
Heuristic-Based Relative Positional Biases
Comparison of Learned vs. Heuristic-Based Relative Positional Biases
Kerple
FIRE
Relative Position Offset Calculation
A self-attention model incorporates positional awareness by adding a bias term directly to the query-key dot product for each pair of positions (i, j). This bias term's value depends on the relative distance between i and j. What is the primary implication of this approach compared to the alternative of adding positional vectors to the input token embeddings?

Incorporating Positional Bias into Attention Scores
In a self-attention mechanism, the score computed between a query at position i and a key at position j is modified by directly adding a bias term whose value depends only on the positions i and j. What is the primary function of this bias term within the attention calculation?

Shared Learnable Bias per Offset
In a self-attention mechanism that uses relative positioning, consider a sequence of tokens where the model is calculating the attention score. If the current query token is at index 8 and the key token being attended to is at index 5, what is the calculated offset that represents their relative position?

A self-attention model calculates the relative position offset between a query at index i and a key at index j using the formula: offset = i - j. Based on this formula, which of the following conclusions is correct?

In a sequence of tokens, the relative position offset between a query at index i and a key at index j is calculated as i - j. If the query's position i is held constant while the key's position j increases (i.e., the key token appears later in the sequence), how does the calculated offset change?
Learn After
Generalization Limit of Offset-Specific Biases
Calculating Positional Bias from Offset
In a self-attention mechanism that uses a shared, learnable parameter for each unique relative position offset, which of the following query-key pairs will share the exact same positional bias parameter as the pair with a query at position 8 and a key at position 3?
T5 Bias for Relative Positional Embedding
Parameter Implications of Offset-Based Positional Bias