Calculating Key Position from Bucket Index
In a relative position encoding system, the interaction between a query at position i=12 and a key at position j is assigned to bucket 5. This system uses a direct one-to-one mapping for this range, where the bucket index is identical to the relative position offset (i - j). What is the position j of the key?
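Because the question states that the bucket index equals the offset i - j in this range, the key position can be recovered by rearranging to j = i - bucket. The following is a minimal Python sketch of that rearrangement, not a reference implementation; the function name `key_position_from_bucket` is hypothetical, and the upper limit of the identity range is not given in the question, so the sketch only rejects negative bucket indices.

```python
def key_position_from_bucket(query_pos: int, bucket: int) -> int:
    """Invert the identity mapping bucket = query_pos - key_pos.

    Valid only where the bucket index equals the offset (small,
    non-negative offsets). The upper end of that range is not stated
    in the question, so it is not checked here.
    """
    if bucket < 0:
        raise ValueError("the identity range covers non-negative offsets only")
    # bucket = i - j  =>  j = i - bucket
    return query_pos - bucket


# Values from the question: query at i=12, interaction assigned to bucket 5.
j = key_position_from_bucket(query_pos=12, bucket=5)
print(j)
```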
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In a relative position encoding scheme, a bias is determined by assigning the interaction between a query at position i and a key at position j to a specific bucket. For a certain range of small, non-negative offsets, this assignment uses a direct one-to-one correspondence, where the bucket index is simply the calculated offset i - j. Given a query at position i=7 and a key at position j=3, which bucket index would be assigned?
Calculating Key Position from Bucket Index
In a relative position encoding system where the bucket index b for a small, non-negative offset i - j is determined by the identity function b(i - j) = i - j, it is true that for every unit increase in the offset, the corresponding bucket index also increases by exactly one unit.
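The unit-step property in the statement above can be checked numerically. This is a small sketch under stated assumptions: the function name `identity_bucket`, the query position 10, and the offsets 0 through 4 are illustrative choices that do not come from the source, and the offsets are assumed to fall inside the identity range.

```python
def identity_bucket(query_pos: int, key_pos: int) -> int:
    """Return the bucket index under the identity mapping b(i - j) = i - j."""
    offset = query_pos - key_pos
    if offset < 0:
        raise ValueError("the identity range covers non-negative offsets only")
    return offset


query_pos = 10  # illustrative query position, not from the source
# Walk the key backwards so the offset grows by one at each step.
buckets = [identity_bucket(query_pos, query_pos - offset) for offset in range(5)]
# Difference between consecutive bucket indices.
steps = [later - earlier for earlier, later in zip(buckets, buckets[1:])]

print(buckets)  # [0, 1, 2, 3, 4]
print(steps)    # [1, 1, 1, 1]
```

Every entry in `steps` is 1, which matches the claim that each unit increase in the offset raises the bucket index by exactly one in this range.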