Learn Before
Concept

Positional Encoding

Unlike Recurrent Neural Networks (RNNs) that process tokens sequentially one-by-one, self-attention ditches sequential operations in favor of parallel computation and does not naturally preserve the order of the input sequence. To address this order-insensitivity, the dominant approach is to represent the sequence order as an additional input associated with each token, called a positional encoding. These encodings can be either learned during training or fixed a priori.

0

1

Updated 2026-05-14

Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

D2L

Dive into Deep Learning @ D2L