Learn Before
  • Data Point as a d-dimensional Vector

Combining Token and Positional Embeddings

In sequence models like the Transformer, the final input representation for a token is created by summing its semantic embedding with an embedding that encodes its position. The formula is $\mathbf{e}_i = \mathbf{x}_i + \mathbf{PE}(i)$, where $\mathbf{x}_i$ is the token embedding vector for the $i$-th token, $\mathbf{PE}(i)$ is the positional encoding vector for position $i$, and $\mathbf{e}_i$ is the resulting combined vector. This process allows the model to use information about the order of tokens.
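The sketch below illustrates this sum with toy values, assuming the standard sinusoidal positional encoding; the array shapes, `d_model` size, and random token embeddings are illustrative choices, not part of the original text.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal PE(i) for positions 0..seq_len-1 (a common choice of positional encoding)."""
    positions = np.arange(seq_len)[:, None]          # shape (seq_len, 1)
    dims = np.arange(d_model)[None, :]               # shape (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                 # shape (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])            # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])            # odd dimensions use cosine
    return pe

# Toy example: 4 tokens, each represented by a d_model-dimensional embedding x_i.
seq_len, d_model = 4, 8
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(seq_len, d_model))             # x_i (hypothetical values)
pos_encodings = sinusoidal_positional_encoding(seq_len, d_model)   # PE(i)

# e_i = x_i + PE(i): element-wise sum combines token meaning and position.
combined = token_embeddings + pos_encodings
print(combined.shape)  # (4, 8)
```

Because the two vectors share the same dimensionality, the combination is a simple element-wise addition, so the model receives a single vector per token that carries both its meaning and its position.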

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Related
  • Combining Token and Positional Embeddings

  • Positional Encoding Vector