Multiple Choice

A model uses a hybrid strategy to handle relative positional distances between tokens, assigning each distance to one of a limited number of 'buckets'. The rules are:

  1. For small distances (e.g., 0-15), each distance is assigned to its own unique bucket.
  2. For medium distances, the ranges of distances assigned to a single bucket grow progressively larger as the distance increases.
  3. For very large distances (e.g., beyond 512), every distance is assigned to a single, final bucket.
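The three rules above can be sketched in Python as follows. This is a minimal illustration, not the model's actual implementation: the function name `relative_position_bucket` and the parameter values (`num_buckets=32`, `max_exact=16`, `max_distance=512`) are assumptions chosen to match the example thresholds in the rules, and the logarithmic spacing used for the medium range is one common way to make bucket ranges grow with distance.

```python
import math


def relative_position_bucket(distance, num_buckets=32, max_exact=16, max_distance=512):
    """Map a non-negative relative distance to a bucket index.

    All parameter values are illustrative assumptions, not taken from
    any specific model.
    """
    if distance < 0:
        raise ValueError("distance must be non-negative in this sketch")

    # Rule 1: small distances each get their own bucket.
    if distance < max_exact:
        return distance

    # Rule 2: medium distances share buckets whose ranges widen
    # logarithmically as the distance grows.
    log_ratio = math.log(distance / max_exact) / math.log(max_distance / max_exact)
    bucket = max_exact + int(log_ratio * (num_buckets - max_exact))

    # Rule 3: anything at or beyond max_distance lands in the final bucket.
    return min(bucket, num_buckets - 1)


if __name__ == "__main__":
    for d in (0, 15, 39, 40, 47, 48, 600):
        print(d, relative_position_bucket(d))
```

Under these assumed parameters, distances 39 through 47 share one bucket, so 40 is grouped with nearby medium distances rather than with any small distance (which keeps its own bucket) or any very large one (which falls into the final bucket).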

Based on this system, which of the following distances is most likely to be assigned to the same bucket as the distance 40?

Updated 2025-10-06

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science