A language model is given two token sequences: sequence x with 10 tokens (at positions 0 through 9) and sequence y with 5 tokens. To process them together, they are concatenated into a single sequence [x, y]. How is the initial input vector for the very first token of the original sequence y calculated before being passed to the first processing layer?
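A minimal sketch of the computation the question asks about, assuming a GPT-style model with learned absolute positional embeddings that are added to token embeddings (all dimensions and token ids below are hypothetical): after concatenation, the first token of y occupies global position len(x) = 10, so its initial input vector is its token embedding plus the positional embedding for position 10, not position 0.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model dimensions, chosen only for illustration.
vocab_size, d_model, max_len = 100, 8, 32

# Learned lookup tables: one row per token id / per position.
token_emb = rng.normal(size=(vocab_size, d_model))
pos_emb = rng.normal(size=(max_len, d_model))

x_ids = [3, 7, 1, 4, 9, 2, 8, 5, 6, 0]   # 10 tokens of x, positions 0..9
y_ids = [11, 12, 13, 14, 15]             # 5 tokens of y

seq = x_ids + y_ids                       # concatenated sequence [x, y]

# The first token of y sits at global position len(x) = 10.
first_y_pos = len(x_ids)
first_y_id = seq[first_y_pos]

# Initial input vector passed to the first layer:
# token embedding + positional embedding at the *global* position.
h0 = token_emb[first_y_id] + pos_emb[first_y_pos]
```

Note that with relative or rotary position schemes the mechanics differ, but the key point is the same: y's first token is indexed by its position in the concatenated sequence, so its position information reflects offset 10.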
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Positional Context in Concatenated Sequences
Evaluating a Representation Generation Method