Short Answer

Interpreting a Linear Positional Bias Value

In a model that uses a linear positional bias, the bias added to an attention score is calculated with the formula Bias = β ⋅ (j - i), where i is the query position, j is the key position, and β is a positive scalar. If the calculated bias for a particular query-key pair is -5β, what is the relative distance between the query and the key, and which token appears earlier in the sequence?
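As a minimal sketch of how the formula behaves, the Python snippet below evaluates Bias = β ⋅ (j - i) for one query-key pair and then recovers the signed offset j - i from a bias value. The function names `linear_bias` and `offset_from_bias` and the value β = 0.5 are illustrative assumptions, not part of the original question. A negative offset means the key position j is smaller than the query position i.

```python
def linear_bias(i: int, j: int, beta: float) -> float:
    """Return beta * (j - i) for query position i and key position j."""
    return beta * (j - i)


def offset_from_bias(bias: float, beta: float) -> int:
    """Recover the signed offset j - i from a bias value (assumes beta > 0)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return round(bias / beta)


beta = 0.5  # arbitrary illustrative value; any positive scalar works

# Query at position 7, key at position 3: the key sits 4 positions earlier.
bias = linear_bias(i=7, j=3, beta=beta)
offset = offset_from_bias(bias, beta)

print(bias)         # -2.0, i.e. -4 * beta
print(offset)       # -4
print(abs(offset))  # 4, the distance between the two positions
```

Dividing the bias by β gives the signed offset: its magnitude is the distance between the two positions, and its sign shows whether the key comes before (negative) or after (positive) the query.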

Updated 2025-10-08

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science