1Cademy - Dot Product Attention

Dot Product Attention

Dot product attention is a fundamental type of multiplicative attention that scores each key by its dot product with the query. Geometrically, the dot product measures the alignment between vectors: for vectors of fixed length, the more closely a query and key point in the same direction, the higher their dot product, whereas orthogonal vectors yield a dot product of $0$. As a result, keys whose representations align more closely with the current query receive larger attention scores. One notable characteristic of pure dot product attention is that it introduces no additional learnable parameters; it relies entirely on the existing vector representations, which requires the query and keys to have the same dimension. In standard query-key notation, the attention score is $a(\mathbf{q}, \mathbf{k}_i) = \mathbf{q}^\top \mathbf{k}_i$, where $\mathbf{q}$ is the query vector and $\mathbf{k}_i$ is the $i$-th key vector.
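
The score formula can be illustrated with a minimal NumPy sketch. The function name `dot_product_scores`, the matrix `K` (keys stacked as rows), and the example vectors are assumptions made for illustration; they are not part of the original text. The sketch computes only the raw scores $\mathbf{q}^\top \mathbf{k}_i$ and nothing beyond them.

```python
import numpy as np


def dot_product_scores(q, K):
    """Return the scores a(q, k_i) = q^T k_i for every row k_i of K.

    q: query vector of shape (d,)
    K: keys stacked as rows, shape (n, d) (hypothetical layout for this sketch)
    """
    q = np.asarray(q, dtype=float)
    K = np.asarray(K, dtype=float)
    # A matrix-vector product gives all n dot products at once.
    # No learnable parameters are involved.
    return K @ q


# Illustrative query and keys (made-up values).
q = np.array([1.0, 0.0])
K = np.array([
    [2.0, 0.0],   # same direction as q: positive score
    [0.0, 3.0],   # orthogonal to q: score of 0
    [-1.0, 0.0],  # opposite direction: negative score
])

scores = dot_product_scores(q, K)
# scores holds the values 2.0, 0.0, -1.0 for the three keys
```

The aligned key receives the largest score, the orthogonal key scores exactly 0, and the opposing key scores negative, which matches the geometric reading above. The first key's score also reflects its length, since the dot product depends on vector magnitudes as well as direction.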