Evaluating a Sequence Representation Method
A common technique in sequence classification models is to prepend a special token to the input sequence and then use the final hidden state corresponding to that token as the sole input to the classification layer. Critically evaluate this approach. What is the primary advantage of this design, and what is a potential limitation?
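The mechanics of this design can be sketched with a small NumPy example. Everything here is illustrative: the hidden states stand in for a real encoder's output, and the dimensions, weight matrix, and class count are made-up assumptions, not values from any particular model.

```python
import numpy as np

# Illustrative sketch (assumed values): an encoder's final-layer output for a
# 5-token input, where index 0 is the prepended classification token.
rng = np.random.default_rng(0)
seq_len, hidden_dim, num_classes = 5, 8, 2  # e.g. 'question' vs 'statement'

hidden_states = rng.normal(size=(seq_len, hidden_dim))  # one vector per token

# Use ONLY the hidden state at position 0 (the special token) as the
# fixed-size summary of the whole sequence.
cls_vector = hidden_states[0]  # shape: (hidden_dim,)

# A linear classification head applied to that single vector.
W = rng.normal(size=(hidden_dim, num_classes))
b = np.zeros(num_classes)
logits = cls_vector @ W + b

# Softmax over the class logits.
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_class = int(np.argmax(probs))
```

The advantage this sketch makes concrete: the classifier sees a single fixed-size vector regardless of sequence length. The corresponding limitation is also visible: every other token's hidden state is discarded, so the design depends entirely on the encoder having aggregated the sequence's meaning into that one position.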
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Sequence Classification Pipeline using the [CLS] Token Output
Evaluating a Sequence Representation Method
A machine learning engineer is building a model to classify sentences as either 'question' or 'statement'. They add a special classification token to the beginning of each input sentence before passing it to an encoder. The encoder then produces a final hidden state vector for every token in the input. For the final classification step, which hidden state vector should be used as the representative summary of the entire sentence?
Debugging a Sequence Classification Model
In a sequence classification task, the special token prepended to the input is designed so that its initial vector representation, before being processed by the main model, contains a summary of the entire sequence's meaning.
Classification on Sequence Representation