Learn Before
Kerple Positional Bias Formula
The Kerple method for positional bias in transformer models uses a power-law function, as indicated by the term '(power)' in its representation. The bias between a query at position m and a key at position n is calculated with the formula -r1 · |m - n|^r2. In this formula, r1 and r2 are hyperparameters that define the scale and exponent of the power-law penalty, respectively.
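A minimal sketch of how this bias could be computed for a whole sequence, assuming the power-law form -r1 · |m - n|^r2; the r1 and r2 values here are illustrative placeholders (in practice they are constrained positive and learned per attention head):

```python
import numpy as np

def kerple_power_bias(seq_len, r1=0.5, r2=1.5):
    """Kerple power-variant bias matrix: bias[m, n] = -r1 * |m - n|**r2.

    r1 sets the scale of the penalty and r2 its exponent. The returned
    matrix is added to the raw attention scores before the softmax, so
    more distant query-key pairs are penalized more heavily.
    """
    pos = np.arange(seq_len)
    dist = np.abs(pos[:, None] - pos[None, :])  # pairwise distances |m - n|
    return -r1 * dist.astype(float) ** r2

bias = kerple_power_bias(5)
```

The diagonal of the matrix is zero (a token attending to its own position incurs no penalty), and the penalty grows monotonically with distance from the diagonal.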

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Related
Kerple Positional Bias Formula
Kerple Logarithmic Bias Formula
Sandwich Method (Chi et al., 2023)
Formula for Relative Position Scaled by Sinusoidal Wavelength
A transformer model incorporates a positional bias mechanism where a penalty is applied to the attention score between a query and a key. This penalty grows larger as the distance between the query's position and the key's position in the sequence increases. Given the sentence 'The quick brown fox jumps over the lazy dog', which of the following query-key pairs would receive the smallest penalty from this mechanism?
Comparing Positional Bias Functions
A self-attention mechanism is modified to include a bias term that systematically penalizes attention scores between pairs of tokens. The magnitude of this penalty increases as the distance between the tokens' positions in the sequence grows. For which of the following tasks would this modification be most likely to hinder the model's performance?