Formula

Geometric Progression Formula for ALiBi's β Scalar per Head

When employing a geometric progression to determine the scalar bias (β) for each head in an ALiBi (Attention with Linear Biases) mechanism, the specific value for the k-th head is calculated using the formula: $\beta_k = \frac{1}{2^{\frac{8}{k}}}$
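
As a quick numerical check, the formula can be evaluated per head. The sketch below is illustrative only: the function names `beta_as_stated` and `alibi_paper_slopes` are hypothetical and not taken from any library. The first function evaluates the expression exactly as written above. The second is included for comparison and assumes the slope schedule described in the original ALiBi paper (Press et al.) for a power-of-two number of heads n, namely 2^(-8k/n) for k = 1..n, which is the sequence with a constant ratio between consecutive heads.

```python
def beta_as_stated(k: int) -> float:
    """Evaluate beta_k = 1 / 2^(8/k), exactly as the formula above is written."""
    if k < 1:
        raise ValueError("head index k must be >= 1")
    return 1.0 / (2.0 ** (8.0 / k))


def alibi_paper_slopes(n_heads: int) -> list[float]:
    """Slopes 2^(-8k/n) for k = 1..n.

    Assumption: this follows the schedule described in the ALiBi paper for a
    power-of-two number of heads; it is shown only for comparison.
    """
    if n_heads < 1:
        raise ValueError("n_heads must be >= 1")
    return [2.0 ** (-8.0 * k / n_heads) for k in range(1, n_heads + 1)]


if __name__ == "__main__":
    print(beta_as_stated(1))  # 0.00390625  (1/256)
    print(beta_as_stated(8))  # 0.5
    print(alibi_paper_slopes(8))  # 1/2, 1/4, ..., 1/256
```

With 8 heads, `alibi_paper_slopes(8)` halves from one head to the next, whereas the values of `beta_as_stated(k)` grow with k and do not share a constant ratio, so it is worth checking which indexing convention a given implementation uses.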

Updated 2026-04-24

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course