
Kerple Logarithmic Bias Formula

The Kerple method for positional bias can be implemented with a logarithmic function that penalizes attention according to token distance. For a query at position $i$ and a key at position $j$, the bias is $\mathrm{PE}(i,j) = -\beta_1 \log(1 + \beta_2 (i - j))$. Here, $\beta_1$ and $\beta_2$ are hyperparameters controlling the scale and shape of the logarithmic penalty.
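
As a rough illustration, the formula can be evaluated for every query-key pair in a short sequence. The sketch below is a minimal pure-Python version, not a reference implementation. The function name `kerple_log_bias`, the default values of `beta1` and `beta2`, and the choice to mask future keys ($j > i$) with negative infinity are all assumptions made for this example. The formula itself only specifies the bias as a function of $i - j$.

```python
import math


def kerple_log_bias(seq_len, beta1=1.0, beta2=1.0):
    """Build a seq_len x seq_len matrix with entries -beta1 * log(1 + beta2 * (i - j)).

    Rows index the query position i, columns index the key position j.
    Assumption for this sketch: attention is causal, so entries with j > i
    are set to -inf instead of evaluating the formula at a negative distance.
    """
    bias = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            if j <= i:
                distance = i - j
                row.append(-beta1 * math.log(1.0 + beta2 * distance))
            else:
                # Hypothetical causal mask; not part of the formula itself.
                row.append(float("-inf"))
        bias.append(row)
    return bias


if __name__ == "__main__":
    # Print the bias matrix for a 4-token sequence with illustrative settings.
    for row in kerple_log_bias(4, beta1=1.0, beta2=1.0):
        print(["{:.3f}".format(value) for value in row])
```

In a typical relative-bias setup, a matrix like this would be added to the attention scores before the softmax. The bias is zero when $i = j$ and becomes more negative as the distance grows, but only logarithmically, so distant tokens are penalized more gently than under a linear penalty.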


Updated 2026-04-24


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course