Multiple Choice

A language model is given two token sequences: sequence x with 10 tokens (at positions 0 through 9) and sequence y with 5 tokens. To process them together, they are concatenated into a single sequence [x, y]. How is the initial input vector for the very first token of the original sequence y calculated before being passed to the first processing layer?
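
The setup in the question can be sketched in code. This is a minimal illustration, assuming learned absolute position embeddings that are added to token embeddings, as in the standard Transformer input layer; a model with a different positional scheme (for example rotary or relative positions) would compute this differently. The table sizes, token ids, and the names `token_embedding`, `position_embedding`, and `input_vectors` are hypothetical and chosen only for the example.

```python
import random

VOCAB_SIZE = 50      # hypothetical vocabulary size
MAX_POSITIONS = 32   # hypothetical maximum sequence length
D_MODEL = 4          # hypothetical embedding width

rng = random.Random(0)  # fixed seed so the tables are reproducible


def random_table(rows, cols):
    """Build a rows x cols table of random floats, standing in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


token_embedding = random_table(VOCAB_SIZE, D_MODEL)
position_embedding = random_table(MAX_POSITIONS, D_MODEL)


def input_vectors(token_ids):
    """Return one input vector per token: token embedding + position embedding."""
    return [
        [t + p for t, p in zip(token_embedding[tok], position_embedding[pos])]
        for pos, tok in enumerate(token_ids)
    ]


x = [3, 14, 15, 9, 2, 6, 5, 35, 8, 9]  # 10 tokens, positions 0 through 9
y = [7, 1, 8, 2, 8]                    # 5 tokens
concatenated = x + y                   # one sequence, positions 0 through 14

vectors = input_vectors(concatenated)

# Under this assumption, y's first token continues the position count from x,
# so it is embedded at position len(x) rather than restarting at position 0.
first_y_position = len(x)
first_y_vector = vectors[first_y_position]
```

Under these assumptions, the only thing that distinguishes the first token of y from the same token appearing elsewhere is the position index used for the lookup.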

Updated 2025-10-02

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science