One-to-One Mapping for Initial T5 Bias Buckets

In the T5 relative positional encoding scheme, the initial range of buckets maintains a direct, one-to-one correspondence with the query-key offsets. Specifically, for buckets indexed from $0$ up to $\frac{n_b + 1}{2} - 1$, each bucket is assigned to a single unique offset (i.e., bucket $0$ matches offset $0$, bucket $1$ matches offset $1$, and so forth). This direct mapping is mathematically denoted by the function $b(i - j) = i - j$.
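The one-to-one range can be illustrated with a minimal Python sketch. The function name `exact_bucket`, the default `n_b = 31`, and the choice to return `None` outside the one-to-one range are all assumptions made for illustration; they are not taken from the T5 codebase, and the sketch covers only the initial buckets described above, not how larger offsets are bucketed.

```python
from typing import Optional


def exact_bucket(i: int, j: int, n_b: int = 31) -> Optional[int]:
    """Return the bucket for offset i - j if it lies in the one-to-one range.

    Hypothetical helper: returns None for any offset outside that range
    (negative offsets, or offsets of (n_b + 1) / 2 and above), which this
    sketch deliberately does not handle.
    """
    if (n_b + 1) % 2 != 0:
        # Assumption: n_b + 1 is even, so (n_b + 1) / 2 is an integer.
        raise ValueError("this sketch assumes n_b + 1 is even")
    num_exact = (n_b + 1) // 2  # number of buckets mapped one-to-one
    offset = i - j
    if 0 <= offset < num_exact:
        return offset  # b(i - j) = i - j
    return None


# With the assumed n_b = 31, offsets 0 through 15 each get their own bucket.
print([exact_bucket(i, 0) for i in range(18)])
```

Under the assumed `n_b = 31`, the last bucket in the one-to-one range is $\frac{31 + 1}{2} - 1 = 15$, so offsets 16 and 17 fall outside it and the sketch returns `None` for them.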

Updated 2026-04-23

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences