Kerple Positional Bias Formula

The power variant of Kerple, marked "(power)" in its name, uses a power-law function as the positional bias in Transformer models. The bias between a query at position $i$ and a key at position $j$ is calculated with the formula:

$$-\beta_1 (i-j)^{\beta_2}$$

In this formula, $\beta_1$ and $\beta_2$ are hyperparameters that define the scale and the exponent of the power-law penalty, respectively.
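To make the formula concrete, the following is a minimal NumPy sketch that builds the full bias matrix for a short sequence. The function name `kerple_power_bias`, the default values of `beta1` and `beta2`, and the choice to mask future positions ($j > i$) with negative infinity are illustrative assumptions for a causal setting, not details taken from the source.

```python
import numpy as np


def kerple_power_bias(seq_len, beta1=1.0, beta2=1.0):
    """Return a (seq_len, seq_len) matrix with entry [i, j] = -beta1 * (i - j) ** beta2.

    Assumption: causal attention, so only keys with j <= i are scored;
    entries with j > i are set to -inf so a later softmax ignores them.
    """
    positions = np.arange(seq_len)
    # offset[i, j] = i - j (query position minus key position)
    offset = positions[:, None] - positions[None, :]
    # Clamp negative offsets to 0 so a fractional exponent never sees a negative base
    distance = np.maximum(offset, 0).astype(np.float64)
    bias = -beta1 * distance ** beta2
    # Mask out future keys (j > i)
    return np.where(offset >= 0, bias, -np.inf)


bias = kerple_power_bias(seq_len=4, beta1=0.5, beta2=2.0)
print(bias)
```

In a typical relative-position-bias setup, a matrix like this would be added to the query-key attention scores before the softmax, so keys farther from the query receive a larger penalty.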

Updated 2026-04-24

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course