Formula for Attention Score with ALiBi Bias

The ALiBi method modifies the standard attention score by adding a positional bias term, $PE(i, j)$, directly to the scaled query-key dot product. Integrating this linear bias into the attention calculation gives the following pre-Softmax score:

$$\beta_{i,j} = \frac{\mathbf{q}_i \cdot \mathbf{k}_j}{\sqrt{d}} + PE(i, j)$$
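The formula above can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the slope value `slope=0.5` and the causal mask are assumptions for demonstration, since the section only defines the biased score itself; in ALiBi the bias is the linear penalty $PE(i, j) = -m\,(i - j)$ for keys at or before the query.

```python
import numpy as np

def alibi_attention_scores(q, k, slope):
    """Pre-Softmax attention scores with an ALiBi-style linear bias.

    q, k  : (seq_len, d) query and key matrices
    slope : per-head slope m (illustrative value, not from the text)
    """
    seq_len, d = q.shape
    # Scaled dot product: q_i . k_j / sqrt(d)
    scores = q @ k.T / np.sqrt(d)
    # Linear bias PE(i, j) = -m * (i - j): distant past keys are penalized
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    bias = -slope * (i - j)
    # Causal mask (assumed here): key j may not lie after query i
    mask = np.where(j <= i, 0.0, -np.inf)
    return scores + bias + mask

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k = rng.standard_normal((4, 8))
beta = alibi_attention_scores(q, k, slope=0.5)
```

On the diagonal (i = j) the bias is zero, so the score for attending to the current token is the plain scaled dot product; each step back in the sequence subtracts another multiple of the slope.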

Updated 2025-10-10


Ch.2 Generative Models - Foundations of Large Language Models
