Learn Before
Multiple Choice

A self-attention mechanism is designed so that the positional influence on the attention score between any two tokens depends only on their relative distance, not their absolute locations. For instance, the positional adjustment between the 3rd and 7th tokens is identical to the adjustment between the 23rd and 27th tokens. Which of the following techniques directly implements this principle?
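
To make the stated principle concrete, here is a minimal NumPy sketch of one technique built on it, a T5-style additive relative position bias. It is illustrative only and not part of the original question: the function name, its parameters, and the random stand-in for a learned bias table are all assumptions. The key property is that the adjustment to the score between positions i and j is looked up by the offset i - j alone.

```python
import numpy as np

def relative_position_bias(seq_len, max_dist=128, seed=0):
    """Additive attention bias where bias[i, j] depends only on i - j.

    Illustrative sketch: a real model would learn `table`; here random
    values stand in for the learned per-offset parameters.
    """
    rng = np.random.default_rng(seed)
    # One scalar per relative offset in [-max_dist, max_dist].
    table = rng.normal(size=2 * max_dist + 1)
    i = np.arange(seq_len)[:, None]   # query positions
    j = np.arange(seq_len)[None, :]   # key positions
    # Clip long-range offsets, then shift into non-negative table indices.
    offset = np.clip(i - j, -max_dist, max_dist) + max_dist
    return table[offset]              # shape (seq_len, seq_len)

bias = relative_position_bias(32)
# 0-indexed: the question's 3rd/7th tokens are indices 2 and 6, and the
# 23rd/27th tokens are indices 22 and 26; both pairs share offset -4,
# so they receive exactly the same positional adjustment.
assert np.isclose(bias[2, 6], bias[22, 26])
```

Rotary position embeddings (RoPE) realize the same property by a different route, rotating query and key vectors so that their dot product depends only on the positional offset.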

Tags: Ch.2 Generative Models - Foundations of Large Language Models; Foundations of Large Language Models; Foundations of Large Language Models Course; Computing Sciences; Analysis in Bloom's Taxonomy; Cognitive Psychology; Psychology; Social Science; Empirical Science; Science