1Cademy - Illustration of BERT-based Architecture for Named Entity Recognition

Example

Illustration of BERT-based Architecture for Named Entity Recognition

This diagram illustrates a common architecture for Named Entity Recognition (NER) using BERT, a direct application of the sequence labeling approach. The input token sequence ($x_1, x_2, \ldots, x_m$) is prepended with a [CLS] token and appended with a [SEP] token, and each position is converted into an embedding ($e_i$) before being fed into the BERT model. BERT processes the entire sequence at once and outputs a contextualized hidden state vector ($h_i$) for each position. For NER, a classification layer on top of BERT is applied to each token's hidden state ($h_1$ through $h_m$), typically with the same weights at every position, to predict one tag per token from a predefined set such as {B, I, O} (Begin, Inside, Outside).
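The per-token classification step can be sketched in a few lines of plain Python. This is a minimal illustration, not the architecture's actual implementation: `fake_encoder` is a hypothetical stand-in that returns random vectors in place of real BERT hidden states, the hidden size of 8 is arbitrary, and the classifier weights `W` and `b` are random and untrained, so the predicted tags carry no meaning. The sketch only shows the data flow: wrap the tokens with [CLS] and [SEP], obtain one $h_i$ per position, and apply one linear layer plus softmax to each of $h_1$ through $h_m$.

```python
import math
import random

TAGS = ["B", "I", "O"]  # tag set from the text: Begin, Inside, Outside


def softmax(scores):
    """Turn raw scores into a probability distribution over tags."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fake_encoder(tokens, hidden_size, rng):
    """Hypothetical stand-in for BERT: one random vector h_i per position."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(hidden_size)] for _ in tokens]


def tag_tokens(tokens, hidden_size=8, seed=0):
    rng = random.Random(seed)
    # Wrap the sequence as described: [CLS] x_1 ... x_m [SEP]
    wrapped = ["[CLS]"] + tokens + ["[SEP]"]
    hidden = fake_encoder(wrapped, hidden_size, rng)

    # One classification layer (W, b), reused at every token position.
    # Assumption: random, untrained weights, for illustration only.
    W = [[rng.uniform(-1.0, 1.0) for _ in range(hidden_size)] for _ in TAGS]
    b = [0.0] * len(TAGS)

    predictions = []
    # Skip the hidden states of [CLS] and [SEP]; classify h_1 .. h_m only.
    for token, h in zip(tokens, hidden[1:-1]):
        logits = [sum(w * x for w, x in zip(row, h)) + bias
                  for row, bias in zip(W, b)]
        probs = softmax(logits)
        best = max(range(len(TAGS)), key=probs.__getitem__)
        predictions.append((token, TAGS[best], probs))
    return predictions


if __name__ == "__main__":
    for token, tag, probs in tag_tokens(["Alice", "visited", "New", "York"]):
        print(token, tag, [round(p, 3) for p in probs])
```

In a trained system the random vectors would be replaced by the encoder's real outputs and `W`, `b` would be learned jointly with the encoder during fine-tuning; the loop structure, one prediction per input token, stays the same.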

Updated 2026-05-02

References

Reference of Foundations of Large Language Models Course
