Definition

dist_max Parameter in T5 Bias

In T5's relative position bucketing scheme, the parameter $\mathrm{dist}_{\mathrm{max}}$ is typically set to a relatively large value. It defines the maximum offset (relative distance) between token positions that the model is expected to encounter. Offsets at or beyond this value all share the final bucket, so they receive the same bias.
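To make the role of the parameter concrete, here is a minimal sketch of a causal (one-directional) bucketing function. The function name `relative_position_bucket`, the argument names, and the defaults `num_buckets=32` and `dist_max=128` are assumptions chosen for illustration, not values stated in this entry; the exact rounding may also differ from any particular T5 implementation.

```python
import math


def relative_position_bucket(offset, num_buckets=32, dist_max=128):
    """Map a non-negative offset (i - j) to a bucket index in [0, num_buckets - 1].

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    if offset < 0:
        raise ValueError("this sketch only covers the causal case, offset >= 0")

    # The first half of the buckets hold small offsets exactly, one per bucket.
    max_exact = num_buckets // 2
    if offset < max_exact:
        return offset

    # Larger offsets are spread logarithmically over the remaining buckets,
    # with dist_max setting where the logarithmic range ends.
    scaled = math.log(offset / max_exact) / math.log(dist_max / max_exact)
    bucket = max_exact + int(scaled * (num_buckets - max_exact))

    # Any offset at or beyond dist_max saturates in the last bucket.
    return min(bucket, num_buckets - 1)


if __name__ == "__main__":
    for offset in (0, 5, 16, 64, 128, 1000):
        print(offset, relative_position_bucket(offset))
```

Under these assumptions, raising `dist_max` stretches the logarithmic range so that distant offsets stay distinguishable for longer, while lowering it causes more offsets to collapse into the final bucket.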


Updated 2026-04-23


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences