Applying Scaled Dot-Product Attention
You are given a single query vector, a key matrix, a value matrix, and the vector dimension d. Your task is to compute the final attention output vector by applying the scaled dot-product attention formula. Show the intermediate results for each major step.
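The steps asked for above can be sketched as follows. This is a minimal NumPy illustration of the formula Softmax(q·Kᵀ / √d)·V for a single query; the function name and the toy inputs are chosen for this example, not taken from the exercise.

```python
import numpy as np

def scaled_dot_product_attention(q, K, V, d):
    # Step 1: raw scores — dot product of the query with each key row
    scores = K @ q                      # shape (n_keys,)
    # Step 2: scale by sqrt(d) to keep the score magnitudes stable
    scaled = scores / np.sqrt(d)
    # Step 3: softmax over the scaled scores to get attention weights
    #         (subtracting the max is for numerical stability only)
    weights = np.exp(scaled - scaled.max())
    weights /= weights.sum()
    # Step 4: weighted sum of the value rows is the attention output
    return weights @ V

# Toy example: d = 2, two key/value pairs
q = np.array([1.0, 0.0])
K = np.array([[1.0, 0.0],
              [0.0, 1.0]])
V = np.array([[1.0, 2.0],
              [3.0, 4.0]])
out = scaled_dot_product_attention(q, K, V, d=2)
```

Printing the intermediate `scores`, `scaled`, and `weights` arrays at each step reproduces the "show your work" structure the exercise asks for.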
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Causal Attention
In an attention mechanism, the scores for a query vector q are calculated by taking its dot product with a set of key vectors K. These scores are then scaled by a factor related to the vector dimension before being passed to a Softmax function to produce weights. A developer implements this but omits the scaling step, using the formula Softmax(q * K^T) * V. What is the most likely adverse effect of this omission, especially when the dimension of the key vectors is large?
Calculating Pre-Softmax Attention Scores
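The adverse effect in that related question can be demonstrated numerically: with large d, unscaled dot products have variance proportional to d, which pushes the softmax toward a near one-hot distribution and vanishing gradients. A small sketch, with arbitrary dimensions and a fixed random seed chosen for this illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512                               # large key/query dimension
q = rng.standard_normal(d)
K = rng.standard_normal((8, d))       # 8 keys

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Unscaled scores have variance ~d, so the softmax saturates;
# dividing by sqrt(d) keeps the score variance near 1.
unscaled_weights = softmax(K @ q)
scaled_weights = softmax(K @ q / np.sqrt(d))
```

Because dividing by √d only raises the softmax temperature, the largest unscaled weight is always at least as large as the largest scaled one; with large d it is typically close to 1, starving the remaining positions.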