Example

Illustration of BERT-based Architecture for Named Entity Recognition

This diagram illustrates a common architecture for Named Entity Recognition (NER) using BERT, a direct application of the sequence labeling approach. An input sequence of tokens $x_1, x_2, \dots, x_m$, prepended with a [CLS] token and appended with a [SEP] token, is converted into embeddings $e_i$ and fed into the BERT model. BERT processes the entire sequence and outputs a contextualized hidden state vector $h_i$ for each token. For the NER task, a classification layer is applied independently to each token's hidden state ($h_1$ through $h_m$) to predict a corresponding tag from a predefined set, such as {B, I, O} (Begin, Inside, Outside).
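
As a concrete illustration of this data flow, here is a minimal sketch in Python, assuming the Hugging Face transformers library and its BertForTokenClassification class (the diagram itself does not prescribe an implementation). The checkpoint name, example sentence, and three-label tag set are illustrative choices, not part of the original description.

```python
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

# Predefined tag set from the text: Begin, Inside, Outside.
tags = ["O", "B", "I"]

# Checkpoint name is an assumption; any BERT encoder works the same way.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(tags)
)
model.eval()

# The tokenizer prepends [CLS] and appends [SEP] automatically.
sentence = "John lives in New York"
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    # BERT produces one contextualized hidden state h_i per token;
    # the classification head maps each h_i to logits over the tag set.
    logits = model(**inputs).logits  # shape: (1, sequence_length, num_labels)

# One predicted tag per token, including the [CLS] and [SEP] positions.
pred_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, tag_id in zip(tokens, pred_ids):
    print(f"{token:12s} -> {tags[tag_id]}")
```

Note that the classification head above is freshly initialized, so the printed tags are meaningless until the model is fine-tuned on labeled NER data; the sketch only demonstrates the path from tokens through BERT's hidden states to per-token tag predictions.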

