1Cademy - Interaction of Semantic and Positional Scores

Learn Before

Formula for Attention Score with ALiBi Bias

Short Answer

Interaction of Semantic and Positional Scores

In a system that calculates a pre-Softmax attention score by adding a linear positional bias to the scaled query-key dot product, describe a scenario where a key that is semantically less similar to a query (i.e., has a lower dot-product score) could receive a higher final attention score than a key that is semantically more similar. Explain your reasoning by referencing the components of the calculation.
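One way to reason about such a scenario is to plug numbers into the calculation. The sketch below is only an illustration under stated assumptions: the vectors, positions, and slope value are made up, the function name `biased_score` is hypothetical, and the bias is taken to be the ALiBi-style penalty `-slope * distance` between the query and key positions.

```python
import math

def biased_score(q, k, query_pos, key_pos, slope):
    """Pre-softmax score: scaled dot product plus a linear distance penalty.

    Assumes an ALiBi-style bias of -slope * |query_pos - key_pos|.
    """
    d = len(q)
    semantic = sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
    positional = -slope * abs(query_pos - key_pos)
    return semantic + positional

# Illustrative values only (not taken from the source).
q = [1.0, 1.0, 1.0, 1.0]
k_far = [2.0, 2.0, 2.0, 2.0]   # semantically closer: scaled dot product 4.0
k_near = [1.5, 1.5, 1.5, 1.5]  # semantically weaker: scaled dot product 3.0

slope = 0.5
query_pos = 10

# The stronger semantic match sits 8 positions back; the weaker one is adjacent.
far_score = biased_score(q, k_far, query_pos=query_pos, key_pos=2, slope=slope)
near_score = biased_score(q, k_near, query_pos=query_pos, key_pos=9, slope=slope)

print(far_score, near_score)  # 0.0 2.5
```

With these numbers the distant key loses 4.0 to the distance penalty while the nearby key loses only 0.5, so the semantically weaker key ends up with the higher pre-softmax score. A smaller slope or a shorter distance would shrink the penalty gap and could reverse the ordering.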

Updated 2025-10-10
