Initial Representation for Concatenated [x, y] Sequences
For a concatenated sequence [x, y], the initial input representation for the Transformer stack is generated on a per-token basis. For each position i' in the combined sequence, the model sums the token's embedding vector with the positional embedding vector for position i'. Positions are indexed over the whole concatenation, so the first token of y receives the positional embedding for position |x|, not position 0. The resulting sum is the initial representation for that position and is fed into the first Transformer layer.
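The per-token computation can be sketched as follows. This is a minimal illustration, assuming lookup-table embeddings; the vocabulary size, model width, and token ids are hypothetical values, not taken from the text.

```python
import numpy as np

# Illustrative dimensions (assumptions, not from the text).
rng = np.random.default_rng(0)
vocab_size, d_model, max_len = 100, 16, 32

token_embedding = rng.normal(size=(vocab_size, d_model))    # one row per token id
positional_embedding = rng.normal(size=(max_len, d_model))  # one row per position

x = [5, 12, 7]    # hypothetical token ids for the prompt x
y = [3, 9]        # hypothetical token ids for the completion y
sequence = x + y  # concatenated [x, y]

# For each position i' in the combined sequence, the initial representation
# is the sum of the token embedding and the positional embedding for i'.
initial = np.stack([
    token_embedding[tok] + positional_embedding[i]
    for i, tok in enumerate(sequence)
])

# Positions run over the whole concatenation: the first token of y sits at
# position len(x), not position 0.
assert np.allclose(
    initial[len(x)],
    token_embedding[y[0]] + positional_embedding[len(x)],
)
print(initial.shape)  # (5, 16): one d_model-sized vector per token of [x, y]
```

The stack of vectors `initial` is what enters the first Transformer layer; everything after that is ordinary layer-wise processing.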

Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Layer-wise Processing in Transformer Inference
Initial Representation for Concatenated [x, y] Sequences
Calculating an Initial Input Vector
A decoder-only model is preparing the input sequence 'The quick brown fox' for processing. To create the initial input representation for the token 'brown' (at position 2), the model retrieves its token embedding vector, V_brown, and the positional embedding vector for position 2, P_2. Which of the following correctly describes the operation used to combine these two vectors into the final representation that is fed into the first layer of the model?
A decoder-only Transformer model is given a sequence of tokens as input. Arrange the following steps in the correct chronological order to describe how the model creates the initial representation that is fed into its first layer.
Input Representation for a Single Token in Autoregressive Generation
A data scientist is using a large language model to determine the conditional log-probability of a specific completion y following a given prompt x. Their process involves concatenating the two sequences into [x, y] and then performing a single forward pass to compute the log-probability of this combined sequence, which they take as their final result. Which statement best analyzes the flaw in this methodology?
You are tasked with using a large language model to compute the conditional log-probability of an output sequence y given an input sequence x. Arrange the following computational steps into the correct chronological order.
Calculating Conditional Log-Probability from Model Outputs
Learn After
A language model is given two token sequences: sequence x with 10 tokens (at positions 0 through 9) and sequence y with 5 tokens. To process them together, they are concatenated into a single sequence [x, y]. How is the initial input vector for the very first token of the original sequence y calculated before being passed to the first processing layer?
Positional Context in Concatenated Sequences
Evaluating a Representation Generation Method