Diagnosing Contextualization Failure in Model Training
Based on the following case study, identify the core process within the model's encoder that is likely underperforming and explain why this failure leads to an inaccurate prediction.
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A language model is being trained using the following modified input sequence: [CLS] The sun is very [MASK] today . [SEP]. This sequence is converted into input embeddings and passed through a multi-layer encoder. Which of the following statements most accurately describes the final hidden state vector that corresponds to the [MASK] token after it has been processed by the encoder?

A language model is being trained on the corrupted input sequence: [CLS] The book was so [MASK] . [SEP]. Arrange the following steps in the correct chronological order, showing how the model processes this input to generate a representation suitable for predicting the masked word.
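The contextualization process the questions above describe can be sketched in miniature: a single scaled dot-product attention step in which the [MASK] position's output becomes a weighted mixture of every token's vector. This is a toy illustration with made-up 2-dimensional embeddings, not a real model; the token vectors, the single attention head, and the absence of learned projection matrices are all simplifying assumptions.

```python
import math

# Toy illustration (hypothetical 2-d "embeddings", invented for this sketch):
# one self-attention step showing how the [MASK] position's hidden state
# becomes a context-weighted mixture of all token vectors.

tokens = ["[CLS]", "The", "sun", "is", "very", "[MASK]", "today", ".", "[SEP]"]

emb = {
    "[CLS]": [0.1, 0.0], "The": [0.2, 0.1], "sun": [0.9, 0.4],
    "is": [0.1, 0.2], "very": [0.3, 0.3], "[MASK]": [0.5, 0.5],
    "today": [0.7, 0.2], ".": [0.0, 0.1], "[SEP]": [0.1, 0.1],
}
X = [emb[t] for t in tokens]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values):
    """Scaled dot-product attention for a single query position."""
    d = len(query)
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]       # softmax over all positions
    # Output = attention-weighted sum of the value vectors.
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(d)]
    return out, weights

mask_idx = tokens.index("[MASK]")
hidden, weights = attend(X[mask_idx], X, X)

# Although the [MASK] input embedding carries no word identity, its output
# hidden state now blends information from "sun", "very", "today", etc. --
# this mixing is the contextualization step the encoder must perform well
# before the masked word can be predicted.
print(hidden, weights)
```

If this mixing step underperforms, the [MASK] hidden state stays close to its uninformative input embedding, and the prediction head has little context to condition on, which is exactly the failure mode the card asks about.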