Designing a Positional Bias for a Specific Task
Imagine you are designing a self-attention model for a task that requires focusing primarily on the relationships between words that are very close to each other in a sentence. How would you structure the positional bias term, which is added to the attention score between position i and position j, to encourage this behavior? Describe the relationship between the bias value and the distance between the two positions (|i - j|).
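One possible design, offered as an illustration rather than the intended solution: set the bias to zero at distance 0 and make it increasingly negative as |i - j| grows, for example b(i, j) = -slope * |i - j|. The sketch below assumes this linear form. The function names and the `slope` parameter are hypothetical, and any other function that decreases with distance would serve the same purpose. Raw attention scores are set to zero so that only the effect of the bias is visible.

```python
import math


def distance_bias(seq_len, slope=1.0):
    """Hypothetical linear penalty: b[i][j] = -slope * |i - j|."""
    return [[-slope * abs(i - j) for j in range(seq_len)]
            for i in range(seq_len)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(scores, bias):
    """Add the bias to the raw scores, then normalize each query row."""
    return [softmax([s + b for s, b in zip(score_row, bias_row)])
            for score_row, bias_row in zip(scores, bias)]


# Uniform (all-zero) raw scores isolate the effect of the bias alone.
seq_len = 5
scores = [[0.0] * seq_len for _ in range(seq_len)]
bias = distance_bias(seq_len, slope=1.0)
weights = attention_weights(scores, bias)

for row in weights:
    print([round(w, 3) for w in row])
```

With uniform raw scores, each query's weight peaks at its own position and decays as the key moves farther away. A larger `slope` concentrates attention more tightly on nearby positions, while a `slope` of 0 removes the locality preference entirely.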
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In a self-attention mechanism, a bias term is added to the score calculated between a query at position i and a key at position j. Consider a scenario where this bias term is designed to be a large negative value when the distance |i - j| is large, and it approaches zero as the distance gets smaller. How would this specific design influence the model's behavior?
Diagnosing Model Behavior via Positional Bias