Multiple Choice

A Named Entity Recognition (NER) model processes the phrase 'Washington Post'. For each word, it calculates a score for the most plausible tags, as shown below:

| Word       | Tag   | Score |
|------------|-------|-------|
| Washington | B-PER | 0.9   |
| Washington | B-ORG | 0.8   |
| Post       | O     | 0.7   |
| Post       | I-ORG | 0.6   |

The model has also learned from its training data that a 'B-ORG' tag is very likely to be followed by an 'I-ORG' tag. A simple 'greedy' approach, which picks the highest-scoring tag for each word independently, would output the sequence: [B-PER, O]. However, an optimal decoding algorithm that also considers the likelihood of tag-to-tag transitions would output the correct sequence: [B-ORG, I-ORG]. What fundamental principle of finding the best label sequence does this example illustrate?
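The trade-off above can be sketched in a few lines of Python. The emission scores come from the question's table; the transition scores are assumed purely for illustration (the question only states that B-ORG → I-ORG is very likely). For a two-word sequence we can score every candidate sequence exhaustively; Viterbi decoding would find the same optimum without enumerating all sequences.

```python
from itertools import product

# Per-word (emission) scores from the question's table.
emissions = [
    {"B-PER": 0.9, "B-ORG": 0.8},   # Washington
    {"O": 0.7, "I-ORG": 0.6},       # Post
]

# Hypothetical tag-to-tag transition scores (assumed for illustration).
transitions = {
    ("B-PER", "O"): 0.2,
    ("B-PER", "I-ORG"): 0.0,   # I-ORG rarely follows B-PER
    ("B-ORG", "O"): 0.2,
    ("B-ORG", "I-ORG"): 0.9,   # B-ORG is very likely followed by I-ORG
}

def greedy(emissions):
    """Pick the highest-scoring tag for each word independently."""
    return [max(e, key=e.get) for e in emissions]

def best_sequence(emissions, transitions):
    """Score every tag sequence by emission + transition scores and
    return the best one (what an optimal decoder would output)."""
    best, best_score = None, float("-inf")
    for seq in product(*[e.keys() for e in emissions]):
        score = sum(e[t] for e, t in zip(emissions, seq))
        score += sum(transitions.get(p, 0.0) for p in zip(seq, seq[1:]))
        if score > best_score:
            best, best_score = list(seq), score
    return best

print(greedy(emissions))                      # ['B-PER', 'O']
print(best_sequence(emissions, transitions))  # ['B-ORG', 'I-ORG']
```

With these assumed numbers, [B-ORG, I-ORG] scores 0.8 + 0.9 + 0.6 = 2.3, beating the greedy choice [B-PER, O] at 0.9 + 0.2 + 0.7 = 1.8: the best sequence is not the sequence of individually best tags.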


Updated 2025-10-10


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science