Learn Before
Multiple Choice

In a sequence-to-sequence model, an attention mechanism calculates a score for each of three input vectors (A, B, and C) relative to a single output vector (D). The scoring function is the simple dot product between the output vector and each input vector. You are given the following geometric relationships:

  • Vector A points in a very similar direction to Vector D.
  • Vector B is orthogonal (at a 90-degree angle) to Vector D.
  • Vector C points in the opposite direction of Vector D.

Which input vector will receive the highest attention score, and what is the underlying reason for this?
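The geometric relationships above can be checked numerically. The sketch below uses hypothetical 2-D vectors chosen only to match the stated geometry (the specific values are assumptions, not part of the question):

```python
# Hypothetical 2-D vectors matching the geometric relationships above.
D = [1.0, 0.0]    # output (query) vector
A = [0.9, 0.1]    # points in a very similar direction to D
B = [0.0, 1.0]    # orthogonal (90 degrees) to D
C = [-1.0, 0.0]   # points in the opposite direction of D

def dot(u, v):
    # Dot-product attention score: sum of elementwise products.
    return sum(x * y for x, y in zip(u, v))

scores = {name: dot(vec, D) for name, vec in [("A", A), ("B", B), ("C", C)]}
print(scores)  # A is positive and largest, B is zero, C is negative
```

Because the dot product grows with the cosine of the angle between the vectors, an aligned vector yields a large positive score, an orthogonal one yields zero, and an opposing one yields a negative score.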

Updated 2025-09-26

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science