Multiple Choice

A model for Named Entity Recognition is being trained. During one step, it processes a sentence and produces the probability distributions below for two of the words. The training process aims to adjust the model's parameters by calculating a loss based on the predicted probability of the correct, ground-truth tag for each word.

Word: 'Anya' (Ground-truth tag: I-PER)

  • B-PER: 0.05
  • I-PER: 0.85
  • O: 0.10

Word: 'Berlin' (Ground-truth tag: B-LOC)

  • B-LOC: 0.10
  • B-ORG: 0.45
  • O: 0.45

Based on this information, which word's prediction will contribute a larger value to the overall training loss for this step, and why?
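The loss described above is, under the standard assumption for NER tagging, the per-token cross-entropy: each word contributes −log p of its ground-truth tag. A minimal sketch using the probabilities from the question (the variable names are illustrative):

```python
import math

# Per-token cross-entropy contribution: -log p(correct tag).
# Probabilities are taken from the question above.
p_anya = 0.85    # P(I-PER | 'Anya'), the ground-truth tag
p_berlin = 0.10  # P(B-LOC | 'Berlin'), the ground-truth tag

loss_anya = -math.log(p_anya)
loss_berlin = -math.log(p_berlin)

print(f"loss('Anya')   = {loss_anya:.3f}")
print(f"loss('Berlin') = {loss_berlin:.3f}")
```

Because −log p grows sharply as the predicted probability of the correct tag falls toward zero, a low-probability correct tag dominates the step's total loss.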


Updated 2025-09-26


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science