Learn Before
Tuning the ALiBi Bias Scalar (β)
In the ALiBi framework, the scalar hyperparameter β sets the magnitude of the positional penalty applied to query-key products. Its value is typically chosen by evaluating candidate settings on a validation dataset and keeping the one that performs best for the task at hand.
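Such a validation sweep can be sketched as a simple grid search. Here `evaluate_on_validation` is a hypothetical stand-in for a real model evaluation; its body and the candidate values are illustrative assumptions, not part of any library:

```python
# Hypothetical validation-based sweep over the ALiBi bias scalar beta.
# `evaluate_on_validation` stands in for running the model on held-out
# data; it is assumed to return a quality score (higher is better).
def evaluate_on_validation(beta):
    # Toy stand-in objective that peaks near beta = 0.1.
    return -abs(beta - 0.1)

candidates = [0.01, 0.05, 0.1, 0.5, 1.0]
best = max(candidates, key=evaluate_on_validation)
print(best)  # 0.1
```

In practice the sweep is the same shape; only the evaluation function changes.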
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
A language model computes its pre-normalized attention scores using the formula:
Score = (query_vector ⋅ key_vector + β ⋅ (key_position - query_position)) / sqrt(dimension). In this model, the scalar hyperparameter β is a fixed negative number. Consider a query token at position i = 10. How does the bias term β ⋅ (key_position - query_position) influence the scores for a key token at position j = 12 compared to a key token at position j = 20, assuming all other components of the score are equal for both keys?
Calculating a Pre-Softmax Attention Score with Positional Bias
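The comparison posed in the question above can be checked numerically. This is a minimal sketch; the values chosen for β, the shared dot product, and the dimension are illustrative assumptions:

```python
import math

def biased_score(qk_dot, beta, i, j, dim):
    """Pre-softmax attention score with a linear positional bias."""
    return (qk_dot + beta * (j - i)) / math.sqrt(dim)

beta = -1.0   # fixed negative scalar (assumed value)
qk_dot = 4.0  # identical query-key dot product for both keys (assumed)
dim = 64
i = 10

near = biased_score(qk_dot, beta, i, 12, dim)  # bias = -1 * 2  = -2
far = biased_score(qk_dot, beta, i, 20, dim)   # bias = -1 * 10 = -10
# The nearer key (j=12) receives a smaller penalty, hence a higher score.
print(near > far)  # True
```

With a negative β, the penalty grows linearly with distance, so attention is biased toward recent positions.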
In a language model using the complete ALiBi attention formula for causal text generation, the model needs to prevent a query token at position i from attending to any key token at a future position j (where j > i). How does the Mask(i, j) term within the formula α(i, j) = Softmax((q_iᵀk_j + β⋅(j-i))/√d + Mask(i, j)) achieve this?
Modeling Arbitrarily Long Sequences with ALiBi
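The effect of the mask term can be demonstrated on a toy score vector: setting Mask(i, j) = -∞ for j > i drives the post-softmax weight of every future position to exactly zero. The score values below are illustrative assumptions:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

i = 2  # query position; keys at positions 0..4
scores = [0.9, 0.4, 1.1, 0.7, 0.2]  # biased scores before masking (assumed)
# Apply the causal mask: -inf for any key strictly in the future.
masked = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
weights = softmax(masked)
# Future positions (j = 3, 4) receive exactly zero attention weight.
print(weights[3] == 0.0 and weights[4] == 0.0)  # True
```

Because exp(-∞) = 0, the masked entries contribute nothing to the softmax numerator or denominator, which is precisely how causality is enforced.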
Tuning the ALiBi Bias Scalar (β)
Learn After
A machine learning team has just finished pre-training a language model using a two-part system. The first, smaller model corrupted text by replacing some words with plausible alternatives. The second, larger model was then trained to identify which words in the text were original and which were replacements. The team's ultimate goal is to use this work to build a system for classifying the sentiment of customer reviews. What is the most effective and standard next step for the team to take?
Impact of ALiBi Bias Scalar on Model Performance
A research team is fine-tuning a language model for a text summarization task. The model uses a positional encoding scheme where a scalar hyperparameter, β, adjusts the strength of a distance-based bias in the attention mechanism. The team experiments with different values for β and records the model's performance on a validation set using the ROUGE score (higher is better). The results are as follows:
β Value    ROUGE Score
0.01       0.35
0.1        0.42
1.0        0.38
10.0       0.29

Based on this data, what is the most reasonable conclusion?
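One way to read off the preferred setting from a table like this is simply to select the β with the highest validation score; the values below are the ones from the question:

```python
# Validation results from the experiment: beta value -> ROUGE score.
results = {0.01: 0.35, 0.1: 0.42, 1.0: 0.38, 10.0: 0.29}
best_beta = max(results, key=results.get)
print(best_beta)  # 0.1
```

The intermediate value wins here: too small a β barely distinguishes positions, while too large a β suppresses attention to all but the nearest tokens.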
Geometric Progression for ALiBi's Scalar per Head
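In the original ALiBi scheme, each attention head gets its own slope, and the slopes form a geometric sequence starting at 2^(-8/n) with that same ratio, for n heads. A minimal sketch of that progression:

```python
def alibi_slopes(n_heads):
    """Per-head ALiBi slopes: a geometric sequence with ratio 2^(-8/n)."""
    ratio = 2 ** (-8 / n_heads)
    return [ratio ** (k + 1) for k in range(n_heads)]

# For 8 heads: 1/2, 1/4, 1/8, ..., 1/256.
print(alibi_slopes(8))
```

This avoids tuning a single β on a validation set: heads with steep slopes focus on nearby tokens, while heads with shallow slopes retain longer-range context.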