Concept

Relative Positional Encoding as a Query-Key Bias

Rather than modifying the initial input token embeddings, an alternative self-attention architecture integrates positional awareness directly into the core interaction calculation. It achieves this by adding a relative positional bias term, $\mathrm{PE}(i, j)$, directly to the query-key product, which structurally alters the attention score between position $i$ and position $j$.
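Concretely, the modified score takes the form $\alpha_{ij} = \frac{q_i^\top k_j}{\sqrt{d}} + \mathrm{PE}(i, j)$, where $\mathrm{PE}(i, j)$ typically depends only on the relative offset $i - j$. The sketch below shows one common instantiation: a learned scalar bias per clipped relative distance, added to the query-key product before the softmax (in the spirit of T5-style relative position biases). The class name, clipping distance `max_dist`, and single-head layout are illustrative assumptions, not details from the text.

```python
import math
import torch
import torch.nn as nn

class RelPosBiasAttention(nn.Module):
    """Single-head self-attention with an additive relative positional bias.

    A minimal sketch: the bias table, clipping distance, and head layout
    are illustrative choices, not taken from the original text.
    """

    def __init__(self, d_model: int, max_dist: int = 8):
        super().__init__()
        self.d_model = d_model
        self.max_dist = max_dist
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learned scalar bias per clipped relative offset i - j,
        # covering the range [-max_dist, max_dist].
        self.rel_bias = nn.Embedding(2 * max_dist + 1, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        seq_len = x.size(1)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Standard scaled dot-product scores between positions i and j.
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_model)

        # PE(i, j): bias looked up by the clipped relative offset i - j.
        pos = torch.arange(seq_len, device=x.device)
        rel = (pos[:, None] - pos[None, :]).clamp(-self.max_dist, self.max_dist)
        bias = self.rel_bias(rel + self.max_dist).squeeze(-1)  # (seq_len, seq_len)

        # The bias is added directly to the query-key product,
        # altering the attention score before normalization.
        attn = torch.softmax(scores + bias, dim=-1)
        return attn @ v

# Example usage with hypothetical shapes:
attn = RelPosBiasAttention(d_model=64)
out = attn(torch.randn(2, 10, 64))  # -> (2, 10, 64)
```

Because the bias is indexed by offset rather than absolute position, the same learned table applies at every position in the sequence, which is what gives this scheme its translation-invariant character.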
