Logarithmic Bucketing for Larger T5 Offsets

Within the T5 relative bias framework, relative position offsets beyond the one-to-one mapping range are grouped into buckets whose widths grow logarithmically. Specifically, each of the remaining buckets, indexed from $\frac{n_b + 1}{2}$ up to $n_b$, covers a progressively wider range of offsets. This bucketing lets the architecture handle long sequences and generalize to larger distances without dedicating a separate parameter to every offset.
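The mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference T5 implementation: the function name `t5_bucket`, the parameter `dist_max` (the offset at which the last bucket is reached), and the default values are assumptions introduced here, and the sketch handles only non-negative offsets (the causal case, where the query position is at or after the key position).

```python
import math


def t5_bucket(offset: int, n_b: int = 31, dist_max: int = 128) -> int:
    """Map a non-negative relative offset to a bucket index in [0, n_b].

    Hypothetical helper for illustration. Assumes n_b + 1 is even, so that
    half = (n_b + 1) / 2 is an integer, and that dist_max > half.
    """
    if offset < 0:
        raise ValueError("this sketch only handles non-negative offsets")

    half = (n_b + 1) // 2

    # Small offsets: one bucket per offset (one-to-one mapping).
    if offset < half:
        return offset

    # Larger offsets: position on a log scale between half and dist_max,
    # spread over the remaining buckets half .. n_b.
    log_ratio = math.log(offset / half) / math.log(dist_max / half)
    bucket = half + int(log_ratio * half)

    # Everything at or beyond dist_max shares the last bucket.
    return min(n_b, bucket)


if __name__ == "__main__":
    for off in (0, 15, 16, 17, 32, 64, 128, 1000):
        print(off, t5_bucket(off))
```

With these assumed defaults, offsets below 16 each get their own bucket, while neighbouring larger offsets start to share buckets, and the shared ranges widen as the offset grows.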

Updated 2026-04-23

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences