End-to-End Pipeline for Text-Pair Classification

The complete process for text-pair classification involves several sequential steps. Initially, two texts are formatted into a single input sequence, typically prepended with a [CLS] token and separated by a [SEP] token. This token sequence is then transformed into a corresponding sequence of numerical embeddings. A Transformer encoder like BERT processes these embeddings to produce a sequence of contextualized hidden states, $\{h_0, \dots, h_m\}$. The hidden state $h_0$, corresponding to the [CLS] token, is selected as the aggregate representation for the entire text pair. Finally, this single vector is passed through a prediction network to generate the classification output.
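To make the steps concrete, here is a minimal sketch of the pipeline using the Hugging Face `transformers` library. The `bert-base-uncased` checkpoint, the example sentence pair, and the two-way linear head are illustrative assumptions, not part of the original description; the head is untrained and only shows the shape of the computation.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

# Step 1: format the pair as a single sequence, [CLS] text_a [SEP] text_b [SEP]
enc = tokenizer("A man inspects the uniform.",   # example pair (illustrative)
                "The man is sleeping.",
                return_tensors="pt")

# Steps 2-3: token embeddings are computed and passed through the Transformer
# encoder, yielding contextualized hidden states {h_0, ..., h_m}
with torch.no_grad():
    out = encoder(**enc)
hidden_states = out.last_hidden_state        # shape: (1, seq_len, hidden_size)

# Step 4: h_0, the [CLS] hidden state, serves as the aggregate pair representation
h0 = hidden_states[:, 0, :]                  # shape: (1, hidden_size)

# Step 5: a prediction network maps h_0 to class scores
# (hypothetical 2-way head, e.g. entailment vs. non-entailment)
classifier = torch.nn.Linear(encoder.config.hidden_size, 2)
logits = classifier(h0)
probs = torch.softmax(logits, dim=-1)
```

In practice the prediction head would be trained jointly with (or on top of) the encoder on labeled text pairs; here it is randomly initialized purely to trace the data flow from input pair to classification output.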
