
Example of T5 Bias Bucketing

The T5 bias mechanism groups relative position offsets into buckets, using fixed-size buckets for small offsets and logarithmically widening buckets for larger ones. For example, the first 16 buckets (0 to 15) each hold exactly one offset, so offsets 0 through 15 map one-to-one to buckets 0 through 15. For larger distances, the buckets widen on a logarithmic scale: bucket 16 covers offsets 16 to 20, bucket 17 covers offsets 21 to 26, and bucket 18 covers offsets 27 to 33. This pattern continues until a final bucket, such as bucket 32, absorbs every offset beyond a certain threshold (e.g., 802 and above).
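The mapping described above can be sketched in a few lines of Python. This is an illustrative sketch, not the T5 reference implementation: the function name `t5_style_bucket` and its parameters `num_exact`, `last_bucket`, and `threshold` are hypothetical, and their default values are simply taken from the example (16 one-to-one buckets, a final bucket 32, and a threshold of 802). The log-spacing formula is an assumption chosen because it reproduces the boundaries quoted here; real T5 code uses its own parameters and may place boundaries differently.

```python
import math


def t5_style_bucket(offset, num_exact=16, last_bucket=32, threshold=802):
    """Map a non-negative relative offset to a bucket index.

    Assumed scheme (hypothetical parameters, matching the example only):
    - offsets below `num_exact` get their own bucket (one-to-one);
    - offsets at or beyond `threshold` all share `last_bucket`;
    - offsets in between are spread over log-spaced buckets.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset < num_exact:
        return offset  # fixed-size region: bucket index equals offset
    if offset >= threshold:
        return last_bucket  # catch-all bucket for very long distances
    # Number of log-spaced steps between the exact region and the last bucket.
    log_steps = last_bucket - num_exact
    # Position of the offset on a log scale, as a fraction in [0, 1).
    frac = math.log(offset / num_exact) / math.log(threshold / num_exact)
    return num_exact + int(frac * log_steps)


if __name__ == "__main__":
    for n in (15, 16, 20, 21, 26, 27, 33, 802):
        print(n, "->", t5_style_bucket(n))
    # Under the assumed defaults: 15->15, 16->16, 20->16, 21->17,
    # 26->17, 27->18, 33->18, 802->32
```

Because many offsets share one bucket, the model only needs one learned bias value per bucket rather than one per possible offset, which is why the long tail can be folded into a single final bucket.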


Updated 2025-10-09


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
