Activity (Process)

Initial Representation for Concatenated [x, y] Sequences

For a concatenated sequence [x, y], the initial input representation for the Transformer stack is built token by token. At each position i' in the sequence, the token at that position is mapped to an embedding vector. That vector serves as the initial representation of position i' and is passed as input to the first Transformer layer.
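The per-position lookup described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the vocabulary size, the embedding dimension, the randomly initialized table, the token ids, and the function name `initial_representation` are all hypothetical, and any positional information a real model may add to the embeddings is left out.

```python
import numpy as np

def initial_representation(x_ids, y_ids, embedding_table):
    """Return one embedding vector per position of the concatenated sequence [x, y]."""
    # Concatenate the token ids of x and y into a single sequence.
    seq = np.concatenate([np.asarray(x_ids, dtype=np.int64),
                          np.asarray(y_ids, dtype=np.int64)])
    # Row lookup: position i' receives the embedding of the token at position i'.
    return embedding_table[seq]

# Hypothetical sizes and a random table standing in for learned embeddings.
rng = np.random.default_rng(0)
vocab_size, d_model = 10, 4
embedding_table = rng.normal(size=(vocab_size, d_model))

x_ids = [1, 5, 2]  # hypothetical token ids for x
y_ids = [7, 3]     # hypothetical token ids for y

h0 = initial_representation(x_ids, y_ids, embedding_table)
print(h0.shape)  # (5, 4): one d_model-sized vector per position of [x, y]
```

Here `h0` plays the role of the input to the first Transformer layer: row 3, for example, is the embedding of token id 7, the first token of y.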

Updated 2025-10-10

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
