1Cademy - Impact of ALiBi Bias Scalar on Model Performance

Learn Before

Tuning the ALiBi Bias Scalar ($\beta$)

Case Study

Impact of ALiBi Bias Scalar on Model Performance

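A research team is fine-tuning a language model for a text summarization task. The model's attention mechanism adds a bias term, β * (j - i), to the attention scores, where i is the query position and j is the key position, so (j - i) is the signed offset between the two tokens. Under causal attention j ≤ i, so with β > 0 the bias is non-positive and grows more negative as the tokens get farther apart. The team trains two versions of the model with different values of the scalar β and observes distinct behaviors on the validation set (see the ROUGE scores in the table below). Analyze the likely cause of each model's performance issues.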

Updated 2025-10-05

β Value	ROUGE Score
0.01	0.35
0.1	0.4
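
To build intuition for the two settings in the table, the following is a minimal sketch of how the bias β * (j - i) reshapes attention weights. It is an illustration only, not the team's model: the function name `alibi_attention_weights`, the 64-token context, and the all-zero raw scores are assumptions chosen so that the bias is the only thing shaping the distribution. It does not reproduce or explain the ROUGE values themselves.

```python
import math


def alibi_attention_weights(scores, beta):
    """Softmax over raw scores plus the bias beta * (j - i).

    Hypothetical helper: the query is taken to be the last position i,
    and j ranges over all earlier (and the current) key positions.
    """
    i = len(scores) - 1  # query position (last token in the context)
    biased = [s + beta * (j - i) for j, s in enumerate(scores)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(b - peak) for b in biased]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed setup: 64 tokens with identical raw scores, so only the bias matters.
raw_scores = [0.0] * 64

for beta in (0.01, 0.1):
    weights = alibi_attention_weights(raw_scores, beta)
    recent_mass = sum(weights[-8:])  # attention on the 8 nearest tokens
    print(f"beta={beta}: share of attention on the last 8 tokens = {recent_mass:.3f}")
```

In this toy setting, the larger β concentrates noticeably more attention on nearby tokens, while the smaller β leaves the distribution closer to uniform over the context. Whether either effect helps or hurts summarization depends on the document lengths and on how much long-range context the task needs, which is the question the case study asks you to analyze.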
