Interpreting Contextual Representations
Based on the training method described in the case study, explain the fundamental reason why the model generates distinct internal representations for the same word ('rock') in different sentences.
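One way to make the question concrete: a contextual encoder computes each token's vector from the whole sentence via self-attention, so the same surface word receives different vectors in different contexts. The sketch below illustrates this empirically. It is a minimal example assuming a BERT-style encoder loaded through Hugging Face transformers; neither the library nor the checkpoint name comes from the card.

```python
# Minimal sketch (assumed setup: Hugging Face transformers + a
# BERT-style checkpoint; not specified by the card itself).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def rock_vector(sentence: str) -> torch.Tensor:
    """Return the final-layer hidden state for the token 'rock'."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, dim)
    rock_id = tokenizer.convert_tokens_to_ids("rock")
    position = (inputs["input_ids"][0] == rock_id).nonzero()[0].item()
    return hidden[position]

v_music = rock_vector("The band played loud rock all night.")
v_stone = rock_vector("He tripped over a rock on the trail.")

# Self-attention mixes the surrounding words into each token's
# vector, so the two 'rock' vectors differ: the cosine similarity
# typically comes out noticeably below 1.0.
sim = torch.cosine_similarity(v_music, v_stone, dim=0)
print(f"cosine similarity: {sim.item():.3f}")
```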
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Interpreting Contextual Representations
A language model is trained on the following sentence where one word is hidden: 'The scientist meticulously calibrated the [MASK] before the experiment.' The model's primary training objective is to predict the hidden word. Why is this task effective for teaching the model to understand the relationships between words?
A language model is trained using an objective where it must predict randomly hidden words in a sentence based on the surrounding words. After this training, the model will generate the exact same numerical representation for the word 'bank' in the sentences 'He sat on the river bank' and 'She withdrew money from the bank'.
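Both related prompts describe the same cloze-style (masked-word) objective, which can also be sketched concretely. The example below, again assuming Hugging Face transformers and a BERT-style checkpoint (assumptions, not from the cards), runs the exact masked sentence from the first prompt: the model must use every surrounding word ('scientist', 'calibrated', 'experiment') to rank plausible fillers, which is why the objective teaches word relationships. For the second prompt, the embedding sketch above can be rerun with 'bank' in place of 'rock'; a model trained this way produces context-dependent vectors, so the two occurrences of 'bank' do not share an identical representation.

```python
# Minimal sketch of the masked-word objective from the related
# prompts (assumed setup: Hugging Face transformers fill-mask
# pipeline with a BERT-style checkpoint).
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

# Print the top three predicted fillers with their probabilities.
predictions = fill(
    "The scientist meticulously calibrated the [MASK] before the experiment."
)
for pred in predictions[:3]:
    print(f"{pred['token_str']:>12}  p={pred['score']:.3f}")
```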