Positional Context in Concatenated Sequences
Consider two token sequences, x and y. A language model computes representations in two separate scenarios: one by processing the concatenated sequence [x, y], and another by processing the sequence y on its own. Explain why the initial input vector for the first token of y differs between these two scenarios. Your explanation should identify the specific component of the input vector that changes and the reason for this change.
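The difference can be made concrete with a small sketch, assuming the common setup in which a token's input vector is the sum of its token embedding and an absolute positional embedding. All names and sizes below (`tok_emb`, `pos_emb`, the token ids) are hypothetical illustrations, not from any specific model:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, max_len = 100, 8, 32
tok_emb = rng.normal(size=(vocab_size, d_model))  # one row per vocabulary token
pos_emb = rng.normal(size=(max_len, d_model))     # one row per absolute position

def input_vectors(token_ids, offset=0):
    # Input vector = token embedding + positional embedding at the token's
    # absolute position within the full sequence being processed.
    return np.array([tok_emb[t] + pos_emb[offset + i]
                     for i, t in enumerate(token_ids)])

x = [1, 2, 3]  # hypothetical token ids for sequence x
y = [7, 8]     # hypothetical token ids for sequence y

alone  = input_vectors(y)                 # y by itself: positions 0, 1
concat = input_vectors(y, offset=len(x))  # y inside [x, y]: positions 3, 4

# The token-embedding component of y's first token is identical in both cases;
# only the positional-embedding component differs (pos_emb[0] vs pos_emb[3]).
print(np.allclose(alone[0], concat[0]))
```

Under this assumption, the first token of y receives pos_emb[0] when y is processed alone but pos_emb[len(x)] inside [x, y], so the two input vectors differ even though the token embedding is unchanged.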
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is given two token sequences: sequence x with 10 tokens (at positions 0 through 9) and sequence y with 5 tokens. To process them together, they are concatenated into a single sequence [x, y]. How is the initial input vector for the very first token of the original sequence y calculated before being passed to the first processing layer?
Evaluating a Representation Generation Method