Short Answer

Calculating a Scaled Attention Score

In a transformer's self-attention layer, a query vector q = [2, 0, 1, 3] is being compared to a key vector k = [1, 2, 2, 1]. The dimensionality of these vectors (d_k) is 4. Based on the standard scaled dot-product attention mechanism, what is the resulting unnormalized attention score? Provide the final numerical answer.
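The unnormalized score is the dot product q · k divided by √d_k. A minimal sketch of the arithmetic (standard scaled dot-product attention, before softmax):

```python
import math

q = [2, 0, 1, 3]  # query vector
k = [1, 2, 2, 1]  # key vector
d_k = len(q)      # dimensionality of the vectors, here 4

# Dot product: 2*1 + 0*2 + 1*2 + 3*1 = 7
dot = sum(qi * ki for qi, ki in zip(q, k))

# Scale by sqrt(d_k): 7 / sqrt(4) = 7 / 2 = 3.5
score = dot / math.sqrt(d_k)
print(score)  # → 3.5
```

So the final numerical answer is 3.5.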


Updated 2025-10-02


Tags

Data Science

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science