Concept

Role of the [CLS] Token in Sequence Classification

The special [CLS][\mathrm{CLS}] token is typically prepended as the start symbol, denoted as x0x_0, of an input sequence before it enters an encoder. While it marks the start at the input level, its primary utility is at the output stage. The encoder's final hidden state corresponding to this token, denoted as h0\mathbf{h}_0, is generally considered to be the representation of the entire sequence. Consequently, this aggregate vector is often fed into a classification layer, such as a Softmax layer, to construct systems for sequence-level tasks like binary classification.

0

1

Updated 2026-04-18

Contributors are:

Who are from:

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences