Learn Before
Linear Relative Position Bias Example
A linear relative position bias scheme incorporates sequence order into attention mechanisms by adding a penalty term, calculated as -β ⋅ (i - j), to the query-key dot product. In this formula, (i - j) is the relative distance between the query at position i and the key at position j, and β is a positive scalar, resulting in a penalty that grows linearly with distance. In a causal attention setting, where a query only attends to previous keys, the bias values for different maximum relative distances are as follows:
- For relative distances of 3, 2, 1, and 0, the biases are: -3β, -2β, -β, 0
- For relative distances of 4, 3, 2, 1, and 0, the biases are: -4β, -3β, -2β, -β, 0
- For relative distances of 5, 4, 3, 2, 1, and 0, the biases are: -5β, -4β, -3β, -2β, -β, 0
- For relative distances of 6, 5, 4, 3, 2, 1, and 0, the biases are: -6β, -5β, -4β, -3β, -2β, -β, 0
This pattern shows that the bias is zero for self-attention (when i = j) and grows into an increasingly negative penalty for positions that are further apart.
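To make the pattern concrete, here is a minimal NumPy sketch (not from the source) that builds the causal bias matrix from the formula above; the function name linear_position_bias and the values seq_len = 4 and beta = 0.1 are illustrative assumptions.

```python
import numpy as np

def linear_position_bias(seq_len: int, beta: float) -> np.ndarray:
    """Causal linear relative position bias: -beta * (i - j) for keys j <= query i."""
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    bias = -beta * (i - j)           # grows linearly with distance, 0 on the diagonal
    bias[j > i] = -np.inf            # causal mask: no attention to future keys
    return bias

# Hypothetical values: seq_len = 4, beta = 0.1.
# The last row is [-0.3, -0.2, -0.1, 0.0], matching the
# "relative distances of 3, 2, 1, and 0" case above.
print(linear_position_bias(4, 0.1))
```

In practice this matrix would be added to the query-key dot-product scores before the softmax, so distant keys receive lower attention weights.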
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Formula for Attention Score with ALiBi Bias
Linear Relative Position Bias Example
In a sequence processing model, a positional bias is calculated to penalize attention scores based on the distance between tokens. The formula used is Bias = -β ⋅ (i - j), where i is the query position, j is the key position, and β is a fixed scalar. If the query token is at position 5, the key token is at position 2, and β = 0.1, what is the calculated bias value?
Visual Example of a Linear Relative Position Bias in Causal Attention
True or False: According to the positional bias formula PE(i, j) = -β ⋅ (i - j), where i is the query position, j is the key position, and β is a positive scalar, the penalty applied to the attention score decreases as the distance between the query and key tokens increases.
Interpreting a Linear Positional Bias Value
Similarity of ALiBi Positional Biases to Length Features
Learn After
In a text-processing model, a bias term is added to the attention scores between a 'query' word and a 'key' word before the final attention weights are computed. This bias is calculated as -β ⋅ d, where d is the distance (number of words) between the query and the key, and β is a fixed positive number. What is the primary effect of this biasing scheme on the model's behavior?
Calculating Linear Relative Position Biases
An attention mechanism uses a linear relative position bias to penalize distant key-value pairs. In a causal setting, a query at a given position attends to itself and all previous positions up to a certain maximum distance. Match each maximum relative distance to the corresponding set of bias values that would be applied, where β is a scalar.