Learn Before
Example of T5 Bias Bucketing
The T5 bias mechanism uses a combination of exact and logarithmically scaled buckets to group relative position offsets. The first 16 buckets (0-15) each cover exactly one offset, mapping one-to-one with their corresponding offsets. For larger distances, the bucket sizes increase logarithmically: bucket 16 covers offsets from 16 to 20, bucket 17 covers offsets from 21 to 26, and bucket 18 covers offsets from 27 to 33. This pattern continues until a final bucket, such as bucket 32, consolidates all offsets beyond a certain threshold (e.g., 802 to infinity).
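The bucketing described above can be sketched in a few lines of Python. This is a minimal illustration, not T5's exact implementation; the parameter values chosen here (16 exact buckets, 17 logarithmic buckets, and a maximum distance of 1024) are assumptions picked so the function reproduces the example boundaries above (20→16, 26→17, 33→18, 802→32).

```python
import math

def relative_position_bucket(offset, max_exact=16, num_log_buckets=17,
                             max_distance=1024):
    """Map a non-negative relative offset to a T5-style bucket index.

    Offsets below max_exact map one-to-one to buckets 0..max_exact-1;
    larger offsets fall into logarithmically widening buckets; anything
    past the final threshold lands in the last bucket
    (max_exact + num_log_buckets - 1 = 32 with these parameters).
    """
    if offset < max_exact:
        return offset  # one bucket per offset for small distances
    # Logarithmic scaling: position of the offset between max_exact and
    # max_distance on a log scale, spread over num_log_buckets buckets.
    log_bucket = math.floor(
        math.log(offset / max_exact)
        / math.log(max_distance / max_exact)
        * num_log_buckets
    )
    # Clamp so all very large offsets share the final bucket.
    return min(max_exact + log_bucket, max_exact + num_log_buckets - 1)
```

For example, `relative_position_bucket(10)` returns 10, offsets 16-20 all return 16, offsets 27-33 all return 18, and any offset of 802 or more returns the catch-all bucket 32.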

Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Unified Formula for T5 Bias Bucketing
Example of T5 Bias Bucketing
Visual Representation of T5 Bias Application (n_b=3, dist_max=5)
A model designer is implementing a mechanism to account for the relative distance between tokens in a sequence. The proposed strategy uses a unique, learnable value for each of the first few relative distances (e.g., 1, 2, 3...), but then groups larger distances into a smaller set of shared values, with the size of these groups increasing as the distance grows. What is the primary trade-off this combined approach is designed to optimize?
Analysis of a Hybrid Positional Bucketing System
Formula for Applying T5 Relative Position Bias
Generalization Advantage of T5 Positional Bias
A model uses a hybrid strategy to handle relative positional distances between tokens, assigning each distance to one of a limited number of 'buckets'. The rules are:
- For small distances (e.g., 0-15), each distance is assigned to its own unique bucket.
- For medium distances, the ranges of distances assigned to a single bucket grow progressively larger as the distance increases.
- For very large distances (e.g., beyond 512), all are assigned to a single, final bucket.
Based on this system, which of the following distances is most likely to be assigned to the same bucket as the distance 40?
Learn After
A model's attention mechanism uses a system to group relative distances (offsets) between tokens into buckets. The system follows these rules:
- Offsets from 0 to 15 are each assigned to their own unique bucket (e.g., offset 10 is in bucket 10).
- For larger distances, the buckets cover logarithmically increasing ranges of offsets. Specifically:
- Bucket 16 covers offsets from 16 to 20.
- Bucket 17 covers offsets from 21 to 26.
- Bucket 18 covers offsets from 27 to 33.
Given these rules, which of the following pairs of token offsets would be assigned to the exact same bucket?
An attention mechanism groups relative token distances (offsets) into buckets using the following rules:
- Offsets 0 through 15 are mapped directly to their corresponding buckets (e.g., offset 12 is in bucket 12).
- For larger distances, the buckets cover logarithmically increasing ranges:
- Bucket 16 covers offsets 16-20.
- Bucket 17 covers offsets 21-26.
- Bucket 18 covers offsets 27-33.
Following this pattern, the relative position offset of 30 would be assigned to bucket ____.
An attention mechanism groups relative token distances (offsets) into buckets according to a specific scheme. Match each given offset to its correct bucket number based on the following rules:
- Offsets 0-15 are mapped one-to-one to buckets 0-15.
- For larger distances, buckets cover logarithmically increasing ranges:
- Bucket 16: offsets 16-20
- Bucket 17: offsets 21-26
- Bucket 18: offsets 27-33
- This pattern continues until a final bucket, Bucket 32, which covers all offsets from 802 onwards.