1Cademy - A researcher is implementing a positional bias mechanism for a multi-layer transformer model, as introduced by Chi et al. in 2023. The goal is to influence the attention scores based on the relative positions of tokens. Given the specific design of this method, which of the following implementation strategies is correct?

Learn Before

Sandwich Method (Chi et al., 2023)

Multiple Choice

Updated 2025-10-04
