Multiple Choice

In a transformer model that uses a relative position bias mechanism, a specific set of initial 'buckets' is used to store shared bias parameters. For small, non-negative relative distances between a query and a key, there is a direct correspondence where the bucket index is identical to the distance. If a query is at position 8 and a key is at position 5, what is the index of the bucket used for their interaction?

0

1
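The direct correspondence described in the question can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `exact_bucket` and the cutoff `num_exact` (the number of buckets that map one-to-one to a distance) are hypothetical and not taken from the course material, and the distance is assumed to be the query position minus the key position.

```python
from typing import Optional


def exact_bucket(query_pos: int, key_pos: int, num_exact: int = 16) -> Optional[int]:
    """Return the bucket index for a small, non-negative relative distance.

    num_exact is an assumed cutoff: distances 0 .. num_exact - 1 each get
    their own bucket. Anything outside that range is handled by other rules
    that this sketch does not model, so None is returned instead.
    """
    distance = query_pos - key_pos  # assumed sign convention: query minus key
    if 0 <= distance < num_exact:
        return distance  # bucket index equals the distance
    return None


# A query at position 10 and a key at position 6 are 4 positions apart.
print(exact_bucket(10, 6))  # 4
```

Substituting the positions from the question gives the bucket index it asks for.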

Updated 2025-10-01

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science