In a pipeline designed for sentence-pair classification, an input like [CLS] sentence A [SEP] sentence B [SEP] is processed by an encoder to produce a sequence of contextualized encodings, one for each token. For the final classification, only the encoding corresponding to the [CLS] token is passed to a Softmax layer. What is the most accurate reason for selecting this specific encoding to represent the entire input?
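The pipeline described above can be sketched in a few lines. This is a toy illustration only, with made-up placeholder numbers standing in for a trained Transformer encoder and classification head (it is not real BERT weights or a real tokenizer); its point is the data flow: the encoder yields one vector per token, and only the vector at position 0, the `[CLS]` slot, is passed through a linear layer and softmax.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def encode(tokens):
    # Stand-in for a Transformer encoder: returns one "contextual" vector
    # per token. In a real encoder, self-attention lets every position
    # attend to every other, so even position 0 ([CLS]) can aggregate
    # information from the whole sentence pair. Values here are arbitrary.
    dim = 4
    return [[((len(tok) * (i + 1) + d) % 7) / 7.0 for d in range(dim)]
            for i, tok in enumerate(tokens)]

tokens = ["[CLS]", "sentence", "A", "[SEP]", "sentence", "B", "[SEP]"]
encodings = encode(tokens)   # one encoding per input token
cls_vec = encodings[0]       # only the [CLS] encoding feeds the classifier

# Classification head: linear layer (placeholder weights) + softmax, 2 labels
W = [[0.5, -0.2, 0.1, 0.3],
     [-0.4, 0.6, 0.2, -0.1]]
logits = [sum(w * x for w, x in zip(row, cls_vec)) for row in W]
probs = softmax(logits)
print(probs)  # a probability distribution over the two labels
```

Note that the other six token encodings are computed but never consumed by the head; the `[CLS]` position is the single dedicated summary slot for the whole pair.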
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being fine-tuned for a sentence-pair classification task (e.g., determining if one sentence is an entailment of another). Arrange the following steps into the correct sequence that describes the data processing pipeline, from the initial input to the final prediction.
Debugging a Sentence-Pair Classification Model