Learn Before
Structure of a Transformer Layer's Input
A large language model is processing the sentence "The cat sat," which has been tokenized into three distinct tokens. Consider the input being prepared for the 5th layer of this model. Based on the standard representation of a Transformer layer's input, describe the structure and components of the matrix that would be fed into this layer. What does each primary component of this matrix represent?
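The structure the question asks about can be sketched numerically. In this toy example (all names and dimensions are illustrative assumptions, not taken from the card), each of the three tokens is represented by one row vector, and the rows are stacked into a matrix whose shape is (number of tokens) × (model dimension); the input to the 5th layer is simply the matrix produced by the 4th layer, with the same per-token row structure.

```python
import numpy as np

# Hypothetical setup: 3 tokens ("The", "cat", "sat"), each represented by a
# d_model-dimensional vector. d_model = 8 is an arbitrary toy value; real
# models use hundreds or thousands of dimensions.
tokens = ["The", "cat", "sat"]
d_model = 8
rng = np.random.default_rng(0)

# Stand-in for the output of layer 4: one row per token position, stacked
# into a (num_tokens x d_model) matrix. Random values substitute for the
# contextualized representations a real model would compute.
layer4_output = rng.standard_normal((len(tokens), d_model))

# The input to layer 5 is exactly this matrix: each row is the current
# representation of one token, kept separate rather than merged.
layer5_input = layer4_output
print(layer5_input.shape)  # (3, 8): one row per token, d_model columns
```

The key point the sketch illustrates is that the matrix has one row per token, so every layer receives a sequence of individual vectors rather than a single pooled vector for the whole sentence.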
Tags
Ch.3 Prompting - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Transformer Layer Output Formula
General Formula for a Transformer Layer
Input Composition in a Prefix-Tuned Transformer Layer
A language model is processing an input sentence that has been broken down into 5 distinct tokens. The input to the first processing layer is represented as a matrix containing 5 separate vectors, one for each token. Why is it fundamentally important for the model to maintain this structure—a sequence of individual vectors—as the input to each subsequent layer, rather than, for example, averaging or concatenating them into a single vector?
Structure of a Transformer Layer's Input
When a Transformer model processes a sentence with 12 tokens, the input to the fifth layer is a single, high-dimensional vector that represents the aggregated meaning of the entire sentence as computed by the first four layers.