A decoder-only Transformer model is given a sequence of tokens as input. Arrange the following steps in the correct chronological order to describe how the model creates the initial representation that is fed into its first layer.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.5 Inference - Foundations of Large Language Models
Comprehension in Revised Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Layer-wise Processing in Transformer Inference
Initial Representation for Concatenated [x, y] Sequences
Calculating an Initial Input Vector
A decoder-only model is preparing the input sequence 'The quick brown fox' for processing. To create the initial input representation for the token 'brown' (at position 2), the model retrieves its token embedding vector,
V_brown, and the positional embedding vector for position 2, P_2. Which of the following correctly describes the operation used to combine these two vectors into the final representation that is fed into the first layer of the model?
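As a minimal sketch of the operation the question is probing: in a standard Transformer, the token embedding and the positional embedding are combined by element-wise addition. The vocabulary, embedding dimension, and randomly initialized tables below are illustrative stand-ins, not real model weights:

```python
import numpy as np

# Toy setup: how a decoder-only Transformer typically builds the initial
# representation for the token 'brown' at position 2 in 'The quick brown fox'.
rng = np.random.default_rng(0)
vocab = {"The": 0, "quick": 1, "brown": 2, "fox": 3}
d_model = 8                                                # toy embedding size
token_embeddings = rng.normal(size=(len(vocab), d_model))  # V table (stand-in)
positional_embeddings = rng.normal(size=(16, d_model))     # P table (stand-in)

v_brown = token_embeddings[vocab["brown"]]  # V_brown: token embedding lookup
p_2 = positional_embeddings[2]              # P_2: positional embedding lookup

# The two vectors are combined by element-wise addition; the sum is the
# representation fed into the first layer.
initial_representation = v_brown + p_2
assert initial_representation.shape == (d_model,)
```

The same sketch also answers the ordering question above: tokenize, look up each token's embedding, look up each position's embedding, then add the two vectors per position.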
Input Representation for a Single Token in Autoregressive Generation