Learn Before
  • Dot Attention

Case Study

Evaluating Attention Mechanisms in Machine Translation

Based on the mathematical form of Function A, explain the most likely reason for its lower performance on complex tasks compared to Function B.
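The section does not restate the two scoring functions, but given the "Dot Attention" prerequisite, Function A is presumably the parameter-free dot-product score and Function B a learned alternative such as additive (MLP) attention. The Python sketch below contrasts the two forms under that assumption; the parameter names W_s, W_h, and v are hypothetical, not taken from the course material.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                          # hidden size (arbitrary for this sketch)
s = rng.normal(size=d)         # decoder state (the query)
H = rng.normal(size=(5, d))    # five encoder states (the keys)

# Function A (assumed): plain dot-product attention.
# There are no trainable parameters -- each score is fixed by the
# geometry of the two vectors alone.
scores_a = H @ s                                   # shape (5,)

# Function B (assumed): additive attention with learned weights.
# W_s, W_h, v are hypothetical parameters; in practice they would be
# trained, which lets the model reshape the comparison for the task.
W_s = rng.normal(size=(d, d))
W_h = rng.normal(size=(d, d))
v = rng.normal(size=d)
scores_b = np.tanh(H @ W_h.T + s @ W_s.T) @ v      # shape (5,)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

print("dot attention weights     :", softmax(scores_a))
print("additive attention weights:", softmax(scores_b))
```

The usual reading of the question: because the dot product has no trainable weights, relevance is fixed to raw geometric similarity between the two states, so it cannot adapt its notion of alignment to more complex tasks the way a parameterized scorer can.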

Updated 2025-10-02

Contributors: Gemini AI (Google)

Tags
  • Data Science
  • Foundations of Large Language Models Course
  • Computing Sciences
  • Ch.1 Pre-training - Foundations of Large Language Models
  • Foundations of Large Language Models
  • Analysis in Bloom's Taxonomy
  • Cognitive Psychology
  • Psychology
  • Social Science
  • Empirical Science
  • Science

Related
  • Example of Predicting Masked Words: Kitten Playing

  • Example of Masked Language Modeling: Kitten Chasing Ball

  • Example of Context-Based Prediction: Kitten Chasing Ball

  • In a sequence-to-sequence model, an attention mechanism calculates a score for three input vectors (A, B, and C) relative to a single output vector (D). The scoring function is the simple dot product between the output vector and each input vector. You are given the following geometric relationships:

    • Vector A points in a very similar direction to Vector D.
    • Vector B is orthogonal (at a 90-degree angle) to Vector D.
    • Vector C points in the opposite direction of Vector D.

    Which input vector will receive the highest attention score, and what is the underlying reason for this? (A numerical sketch follows this list.)

  • Evaluating Attention Mechanisms in Machine Translation

  • Calculating a Dot Attention Score
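For the dot-attention question in the list above, here is a minimal numerical sketch with hypothetical 2-D vectors chosen to match the stated geometry (the question gives directions only, not concrete coordinates). The dot product rewards directional alignment: the vector pointing nearly the same way as D scores highest, the orthogonal one scores zero, and the opposing one scores negative, so A receives the highest attention weight after the softmax.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical 2-D vectors matching the stated geometric relationships.
D = np.array([1.0, 0.0])    # output vector
A = np.array([0.9, 0.1])    # points in a very similar direction to D
B = np.array([0.0, 1.0])    # orthogonal (90 degrees) to D
C = np.array([-1.0, 0.0])   # opposite direction to D

scores = np.array([v @ D for v in (A, B, C)])
print("raw dot scores:", scores)           # [ 0.9  0.  -1. ]
print("attention     :", softmax(scores))  # A receives the largest weight
```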
