Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Computing Sciences
Foundations of Large Language Models Course
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
FIRE Positional Bias Formula
A self-attention mechanism is designed so that the positional influence on the attention score between any two tokens depends only on their relative distance, not their absolute locations. For instance, the positional adjustment between the 3rd and 7th tokens is identical to the adjustment between the 23rd and 27th tokens. Which of the following techniques directly implements this principle?
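In symbols (using generic notation not given above: q_i and k_j for the query and key vectors at positions i and j, d for the head dimension, g for the positional function), the principle amounts to an additive score bias that sees only the offset i - j:

\[
\mathrm{score}(i, j) \;=\; \frac{q_i \cdot k_j}{\sqrt{d}} \;+\; g(i - j)
\]

Because g receives only i - j, the pairs (3, 7) and (23, 27) both map to g(4) and receive the same adjustment, which is exactly the invariance described. FIRE is one instantiation of this idea: roughly, it realizes the bias with a small learned MLP f_\theta applied to a log-compressed (and, in the full formulation, query-position-normalized) transform of the relative distance; the exact expression is given on the FIRE Positional Bias Formula card above.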
Analyzing the Functional Approach to Positional Bias
An LLM architect is designing a self-attention mechanism in which the positional influence between any two tokens is applied directly as an additive bias to the attention score. The core design principle is that this bias must be produced by a specific, continuous mathematical function whose only input is the relative distance between the tokens. Which of the following implementation strategies directly realizes this design principle?
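For concreteness, here is a minimal sketch of the strategy the question points to, written in PyTorch. The class name RelativeDistanceBias, the MLP sizes, and the log-compression step are illustrative assumptions (FIRE-flavored), not an implementation specified by the course material; the essential point is that the bias added to every attention score comes from a continuous learned function whose only input is the relative distance i - j.

    # Hedged sketch: attention bias as a learned continuous function of relative distance.
    import torch
    import torch.nn as nn

    class RelativeDistanceBias(nn.Module):
        """Maps a relative distance to a per-head scalar bias via a continuous learned function."""

        def __init__(self, num_heads: int, hidden: int = 32):
            super().__init__()
            # f_theta: R -> R^num_heads, a continuous function of the distance alone.
            self.mlp = nn.Sequential(
                nn.Linear(1, hidden),
                nn.ReLU(),
                nn.Linear(hidden, num_heads),
            )

        def forward(self, seq_len: int) -> torch.Tensor:
            pos = torch.arange(seq_len, dtype=torch.float32)
            rel = pos[:, None] - pos[None, :]            # (seq_len, seq_len), entry i - j
            # Log-compress the magnitude so long distances stay in a workable range
            # (FIRE-style transform; the exact normalization is on the related card).
            x = torch.sign(rel) * torch.log1p(rel.abs())
            bias = self.mlp(x.unsqueeze(-1))             # (seq_len, seq_len, num_heads)
            return bias.permute(2, 0, 1)                 # (num_heads, seq_len, seq_len)

    # Usage inside attention: add the bias to the raw scores before softmax.
    num_heads, seq_len, head_dim = 4, 16, 8
    q = torch.randn(num_heads, seq_len, head_dim)
    k = torch.randn(num_heads, seq_len, head_dim)
    scores = q @ k.transpose(-1, -2) / head_dim ** 0.5   # (num_heads, seq_len, seq_len)
    scores = scores + RelativeDistanceBias(num_heads)(seq_len)
    attn = scores.softmax(dim=-1)

    # Because the bias depends only on i - j, positions (3, 7) and (23, 27)
    # would receive identical adjustments in a longer sequence.

Note the contrast this sketch makes explicit: absolute positions never reach the bias function, so the adjustment is invariant to shifting both tokens by the same amount.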