An engineer is building a translation model. The core of the model is a mechanism that, for each word, computes a new representation by taking a weighted sum of the representations of all words in the sentence. The engineer observes that the model produces exactly the same internal representation for the phrases 'the old man's car' and 'the man's old car'. What is the most probable reason for this behavior?
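The behavior in the question can be reproduced directly: with token embeddings alone (no positional encoding), a self-attention layer is permutation-invariant per word. A minimal NumPy sketch, assuming a hypothetical toy vocabulary and random fixed embedding and projection matrices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and parameters (assumption: random fixed weights).
vocab = {w: i for i, w in enumerate(["the", "old", "man's", "car"])}
d = 8
E = rng.normal(size=(len(vocab), d))               # embedding table
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(tokens):
    """Single self-attention layer with NO positional encoding."""
    X = E[[vocab[t] for t in tokens]]              # (n, d) token embeddings only
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)                  # scaled dot-product scores
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over positions
    return weights @ V                             # (n, d) output representations

s1 = ["the", "old", "man's", "car"]
s2 = ["the", "man's", "old", "car"]
out1, out2 = self_attention(s1), self_attention(s2)

# Each word's output depends only on the *set* of words, not their order:
for w in set(s1):
    assert np.allclose(out1[s1.index(w)], out2[s2.index(w)])
```

Because the softmax and the weighted sum range over an unordered set of key/value vectors, permuting the input permutes the output rows but leaves each word's representation unchanged, which is exactly what the engineer observes.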
Tags
Data Science
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Consider a language model that uses a standard self-attention mechanism but lacks any method for encoding word positions. The model is given two distinct input sentences:
Sentence 1: 'A dog chases a cat.'
Sentence 2: 'A cat chases a dog.'
After these sentences pass through a single self-attention layer, how would the final output representation for the word 'chases' compare between the two sentences?
Debugging a Permutation-Invariant Model