1Cademy - Diagnosing a Language Models Performance Issue

Learn Before

Formula for Attention Weight with Relative Positional Encoding

Case Study

Diagnosing a Language Model's Performance Issue

A language model is being developed for a machine translation task. During testing, it is observed that while the model generates grammatically plausible sentences, it frequently scrambles the word order, leading to nonsensical translations, especially for longer input sentences. The engineering team has confirmed that the attention weights are being calculated using the formula shown in the case details. Based on an analysis of this formula, identify the most probable cause for the model's failure to preserve correct word order and explain your reasoning.

0

1

Updated 2025-10-08

Contributors are:

Who are from:

Learn Before

Related