Learn Before
A model for Named Entity Recognition is being trained. During one step, it processes a sentence and produces the probability distributions below for two of the words. The training process aims to adjust the model's parameters by calculating a loss based on the predicted probability of the correct, ground-truth tag for each word.
Word: 'Anya' (Ground-truth tag: I-PER)
B-PER: 0.05, I-PER: 0.85, O: 0.10
Word: 'Berlin' (Ground-truth tag: B-LOC)
B-LOC: 0.10, B-ORG: 0.45, O: 0.45
Based on this information, which word's prediction will contribute a larger value to the overall training loss for this step, and why?
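One way to reason about this is to compute the per-word negative log-likelihood directly. The sketch below (plain Python, no ML framework assumed) applies -log p to the probability each word's distribution assigns to its ground-truth tag, using the numbers from the question above:

```python
import math

# Probability the model assigned to the ground-truth tag for each word
# (values taken from the distributions in the question)
p_correct = {
    "Anya": 0.85,    # ground truth: I-PER
    "Berlin": 0.10,  # ground truth: B-LOC
}

# Per-word negative log-likelihood loss: -log p(correct tag)
losses = {word: -math.log(p) for word, p in p_correct.items()}

for word, loss in losses.items():
    print(f"{word}: {loss:.3f}")
# Anya: 0.163
# Berlin: 2.303
```

Because the loss grows as the probability of the correct tag shrinks, the word whose correct tag received the lower probability contributes far more to the total loss for the step.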
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Negative Log-Likelihood Loss for NER
Model Parameter Adjustment during Training
Consider a model being trained to assign a category tag (e.g., 'Person', 'Location', 'Other') to each word in a sentence. If, for a specific word, the model's output assigns a very high probability (e.g., 0.98) to the correct, ground-truth tag, that word's loss (-log 0.98 ≈ 0.02) is close to zero, so the training process will make only a small adjustment to the model's parameters based on this word's prediction.