Sandwich Positional Bias Formula

The Sandwich method calculates the query-key positional bias as a sum of cosine functions of the relative distance between a query at position $i$ and a key at position $j$. The formula is defined as: $\mathrm{PE}(i,j) = \sum_{k=1}^{\bar{d}/2} \cos\big((i - j)/10000^{2k/\bar{d}}\big)$, where $\bar{d}$ is a hyperparameter.
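To make the formula concrete, here is a minimal NumPy sketch that evaluates $\mathrm{PE}(i,j)$ for every query-key pair in a short sequence. The function name `sandwich_bias` and the parameters `seq_len` and `d_bar` are illustrative choices, not names from the source, and the sketch assumes $\bar{d}$ is even and that positions are indexed from 0.

```python
import numpy as np

def sandwich_bias(seq_len: int, d_bar: int) -> np.ndarray:
    """Return a (seq_len, seq_len) matrix whose entry [i, j] is PE(i, j).

    Hypothetical helper for illustration; assumes d_bar is even.
    """
    # k runs over 1, ..., d_bar / 2, as in the summation.
    k = np.arange(1, d_bar // 2 + 1)
    # Per-term scaling 1 / 10000^(2k / d_bar).
    inv_scale = 10000.0 ** (-2.0 * k / d_bar)
    # Relative distances (i - j) for all query/key position pairs.
    pos = np.arange(seq_len)
    rel = pos[:, None] - pos[None, :]
    # Sum the cosine terms over k for each pair (i, j).
    return np.cos(rel[:, :, None] * inv_scale).sum(axis=-1)

bias = sandwich_bias(seq_len=4, d_bar=8)
```

Two properties follow directly from the formula and can be checked on `bias`: each diagonal entry equals $\bar{d}/2$, since every term is $\cos(0) = 1$, and the matrix is symmetric, since cosine is an even function of $i - j$.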

Updated 2026-04-24

Tags

Foundations of Large Language Models

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences