In a relative position encoding system where the bucket index b for a small, non-negative offset i - j is given by the identity function b(i - j) = i - j, every unit increase in the offset increases the corresponding bucket index by exactly one unit.
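The identity mapping can be checked with a minimal sketch. The function name `identity_bucket` and the cutoff `num_exact = 8` are assumptions made for illustration; the statement only says the identity applies to "small, non-negative" offsets and does not fix where that range ends.

```python
def identity_bucket(i: int, j: int, num_exact: int = 8) -> int:
    """Bucket index for a query at position i and a key at position j.

    Only the identity range is modelled: 0 <= i - j < num_exact.
    num_exact is a hypothetical cutoff, not a value taken from the text.
    """
    offset = i - j
    if not 0 <= offset < num_exact:
        raise ValueError("offset is outside the identity range")
    return offset  # b(i - j) = i - j


# Fix the key at j = 0 and move the query one position at a time.
buckets = [identity_bucket(i, 0) for i in range(8)]

# Difference between consecutive bucket indices.
steps = [later - earlier for earlier, later in zip(buckets, buckets[1:])]

print(buckets)
print(steps)
```

Within the identity range each entry of `steps` is 1, which is the unit-for-unit behaviour the statement describes. Offsets outside that range are rejected here rather than guessed at, since the statement does not describe how they are bucketed.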
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
In a relative position encoding scheme, a bias is determined by assigning the interaction between a query at position i and a key at position j to a specific bucket. For a certain range of small, non-negative offsets, this assignment uses a direct one-to-one correspondence, where the bucket index is simply the calculated offset i - j. Given a query at position i = 7 and a key at position j = 3, which bucket index would be assigned?
Calculating Key Position from Bucket Index