Concept

Dot Product Attention

Dot product attention is a fundamental type of multiplicative attention based on dot-product similarity. Geometrically, the dot product measures the alignment between vectors: if a query and a key point in a similar direction, their dot product is higher, whereas orthogonal vectors yield a dot product of 0. This implies that keys which are more conceptually related to the current query will receive larger attention scores. One notable characteristic of pure dot product attention is that it does not introduce any additional learnable parameters, relying entirely on the existing vector representations. In standard query-key notation, the attention score is $a(\mathbf{q}, \mathbf{k}_i) = \mathbf{q}^\top \mathbf{k}_i$, where $\mathbf{q}$ is the query vector and $\mathbf{k}_i$ is the $i$-th key vector.
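To make the geometric reading concrete, here is a minimal sketch in plain Python that computes the score $\mathbf{q}^\top \mathbf{k}_i$ for a few keys. The helper names (`dot`, `dot_product_scores`) and the example vectors are illustrative assumptions, not taken from the source. The sketch computes only the raw scores and omits any later normalization step.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def dot_product_scores(query, keys):
    """Return the score a(q, k_i) = q^T k_i for every key k_i.

    No learnable parameters are involved: the scores depend only on
    the query and key vectors themselves.
    """
    return [dot(query, k) for k in keys]


# Hypothetical example vectors, chosen to show the three geometric cases.
query = [1.0, 0.0]
keys = [
    [2.0, 0.0],   # same direction as the query: positive score
    [0.0, 3.0],   # orthogonal to the query: score of 0
    [-1.0, 0.0],  # opposite direction: negative score
]

scores = dot_product_scores(query, keys)
print(scores)
```

The aligned key receives the largest score, the orthogonal key scores exactly 0, and the opposing key scores below 0, which matches the alignment interpretation described above.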

Updated 2026-05-14

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

D2L

Dive into Deep Learning @ D2L