Learn Before
A Named Entity Recognition (NER) model processes the phrase 'Washington Post'. For each word, it calculates a score for the most plausible tags, as shown below:
| Word | Tag | Score |
|---|---|---|
| Washington | B-PER | 0.9 |
| Washington | B-ORG | 0.8 |
| Post | O | 0.7 |
| Post | I-ORG | 0.6 |
The model has also learned from its training data that a 'B-ORG' tag is very likely to be followed by an 'I-ORG' tag. A simple 'greedy' approach, which picks the highest-scoring tag for each word independently, would output the sequence: [B-PER, O]. However, an optimal decoding algorithm that also considers the likelihood of tag-to-tag transitions would output the correct sequence: [B-ORG, I-ORG]. What fundamental principle of finding the best label sequence does this example illustrate?
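The contrast between the two decoding strategies can be sketched in a few lines of Python. The emission scores below come from the table above; the transition scores are assumed values for illustration (the question only states that B-ORG → I-ORG is very likely), and the Viterbi-style search is a minimal sketch, not a full NER decoder.

```python
# Emission scores from the question; transition scores are ASSUMED for illustration.
emission = [
    {"B-PER": 0.9, "B-ORG": 0.8},   # Washington
    {"O": 0.7, "I-ORG": 0.6},       # Post
]
transition = {
    ("B-PER", "O"): 0.5,
    ("B-PER", "I-ORG"): -1e9,       # I-ORG may not follow B-PER
    ("B-ORG", "I-ORG"): 0.9,        # the "very likely" transition
    ("B-ORG", "O"): 0.5,
}

def greedy(emission):
    """Pick the highest-scoring tag for each word independently."""
    return [max(tags, key=tags.get) for tags in emission]

def viterbi(emission, transition):
    """Find the tag sequence maximising total emission + transition score."""
    scores = dict(emission[0])      # best score ending in each tag so far
    backptr = []                    # best previous tag for each current tag
    for tags in emission[1:]:
        new_scores, ptr = {}, {}
        for tag, e in tags.items():
            prev = max(scores,
                       key=lambda p: scores[p] + transition.get((p, tag), -1e9))
            new_scores[tag] = scores[prev] + transition.get((prev, tag), -1e9) + e
            ptr[tag] = prev
        backptr.append(ptr)
        scores = new_scores
    # Trace back from the best final tag.
    path = [max(scores, key=scores.get)]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(greedy(emission))               # ['B-PER', 'O']
print(viterbi(emission, transition))  # ['B-ORG', 'I-ORG']
```

Greedy scoring looks at each word in isolation, so it cannot "see" that choosing B-ORG for the first word unlocks a high-scoring transition into I-ORG; the dynamic-programming search scores whole sequences and therefore finds the globally better path.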
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Evaluating NER Output Sequences
In a Named Entity Recognition (NER) system, after a model has calculated the probability for each possible tag (e.g., B-PER, I-PER, O) for each word, a 'greedy' decoding strategy would be to simply choose the most probable tag for each word independently. Which of the following statements best explains why this greedy approach can fail to produce the optimal sequence of tags for the entire sentence?