1Cademy - An attention mechanism uses a linear relative position bias to penalize distant key-value pairs. In a causal setting, a query at a given position attends to itself and all previous positions up to a certain maximum distance. Match each maximum relative distance to the corresponding set of bias values that would be applied, where β is a scalar.
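
As a rough illustration of the setup described above, the sketch below lists the bias applied at each relative distance for a given maximum distance. It assumes the common convention in which the bias for a key `d` positions behind the query is `-β·d` (so the query's own position gets 0), listed from nearest to farthest. The function names and the sign convention are assumptions made for illustration, not taken from the source.

```python
def linear_bias_coefficients(max_distance):
    """Bias per relative distance 0..max_distance, as multiples of beta.

    Assumed convention: a key d positions behind the query gets -d * beta,
    so the query's own position (d = 0) is not penalized.
    """
    return [-d for d in range(max_distance + 1)]


def linear_bias_values(max_distance, beta):
    """Same list with a concrete (hypothetical) value substituted for beta."""
    return [c * beta for c in linear_bias_coefficients(max_distance)]


if __name__ == "__main__":
    # One row per maximum relative distance, nearest key first.
    for max_d in range(4):
        print(max_d, linear_bias_coefficients(max_d))
```

Under this assumed convention, a maximum relative distance of 2 would correspond to the bias set 0, -β, -2β, and each extra step of distance appends one more multiple of -β.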

Learn Before

Example of Linear Relative Position Bias Values in Causal Attention

Matching

Updated 2025-10-10
