
Learn Before

Initial Input Representation for Transformer Layers

Multiple Choice

A decoder-only model is preparing the input sequence 'The quick brown fox' for processing. To create the initial input representation for the token 'brown' (at position 2, counting from 0), the model retrieves its token embedding vector, V_brown, and the positional embedding vector for position 2, P_2. Which of the following correctly describes the operation used to combine these two vectors into the final representation that is fed into the first layer of the model?
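As a point of reference, here is a minimal sketch of how this step commonly looks in models with learned absolute positional embeddings (GPT-2 style). That architecture is an assumption, since the question does not name one. In that setup, both vectors share the same dimension and are combined by element-wise addition, so the result keeps that dimension. The 4-dimensional values and the helper name `combine` are hypothetical, chosen only for illustration.

```python
def combine(token_vec, pos_vec):
    """Element-wise sum of a token embedding and a positional embedding.

    Assumes both vectors have the same dimension, as they do when the
    model uses learned absolute positional embeddings.
    """
    if len(token_vec) != len(pos_vec):
        raise ValueError("embeddings must share the same dimension")
    return [t + p for t, p in zip(token_vec, pos_vec)]


# Hypothetical 4-dimensional embeddings (illustrative values only).
V_brown = [0.2, -0.1, 0.5, 0.0]    # token embedding for 'brown'
P_2 = [0.01, 0.03, -0.02, 0.04]    # positional embedding for position 2

# Input representation for 'brown' that is fed into the first layer.
x_brown = combine(V_brown, P_2)
```

Under this assumption the output has the same length as each input, which is what distinguishes addition from alternatives such as concatenation (which would double the dimension).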

Updated 2025-10-02
