Definition

Embedding Size in Transformer Models

In Transformer models, the embedding size, denoted d_e, is the dimensionality of the real-valued vector that represents each token. The final input vector for any given token is therefore a d_e-dimensional real-valued vector. It is formed by summing its constituent parts (the token embedding, the positional embedding, and the segment embedding), each of which is itself a d_e-dimensional real-valued vector. The sum is element-wise, which is why every part must share the same size d_e.
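
The summation described above can be illustrated with a minimal sketch. It assumes randomly initialized lookup tables in place of learned ones, and every name and default in it (build_input_vectors, make_table, vocab_size, max_len, num_segments, d_e=8) is hypothetical and chosen only for illustration; it is not taken from any particular model or library.

```python
import random


def make_table(rows, d_e, rng):
    """Return a rows x d_e table of random values standing in for learned embeddings."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d_e)] for _ in range(rows)]


def build_input_vectors(token_ids, segment_ids, d_e=8, vocab_size=100,
                        max_len=16, num_segments=2, seed=0):
    """Sum token, positional, and segment embeddings for each position.

    All three tables use the same width d_e, so the element-wise sum is
    well defined and the result is again d_e-dimensional.
    """
    rng = random.Random(seed)
    token_table = make_table(vocab_size, d_e, rng)       # one row per vocabulary entry
    position_table = make_table(max_len, d_e, rng)       # one row per position
    segment_table = make_table(num_segments, d_e, rng)   # one row per segment

    inputs = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vec = [t + p + s for t, p, s in
               zip(token_table[tok], position_table[pos], segment_table[seg])]
        inputs.append(vec)
    return inputs


vectors = build_input_vectors(token_ids=[5, 17, 42], segment_ids=[0, 0, 1])
print(len(vectors), len(vectors[0]))  # 3 8
```

Each of the three tokens ends up with one vector of length d_e. Changing d_e changes the width of all three tables together, which reflects the constraint that the parts must have the same dimensionality to be summed.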

Updated 2026-04-17

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences
