Contextual Representation Analysis
A team is using a pre-trained sequence encoding model and feeds it two different sentences, each containing the word 'close'. They observe that the numerical vector generated for 'close' differs significantly between the two outputs. Based on the principles of how these models generate representations, explain this phenomenon.
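The phenomenon can be demonstrated with a toy sketch (not any specific pre-trained model): a single self-attention layer over fixed, randomly initialized token embeddings. Because each output vector is a weighted mixture of every token in the sequence, the same word receives a different contextual representation when its neighbours change. The vocabulary, embedding width, and sentences below are illustrative assumptions.

```python
import numpy as np

# Toy sketch: one self-attention layer over static embeddings.
# Goal: show the contextual vector for 'close' depends on its context.
rng = np.random.default_rng(0)
d = 8  # hypothetical embedding width
vocab = {w: rng.normal(size=d) for w in
         ["please", "close", "the", "door", "race", "was", "very"]}

def self_attention(tokens):
    """Contextual vectors via softmax(E E^T / sqrt(d)) E, with Q = K = V = E."""
    E = np.stack([vocab[t] for t in tokens])       # (m, d) static embeddings
    scores = E @ E.T / np.sqrt(d)                  # pairwise similarity scores
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # row-wise softmax
    return weights @ E                             # each row mixes all tokens

# 'close' as a verb (shut) vs. an adjective (near):
h1 = self_attention(["please", "close", "the", "door"])[1]
h2 = self_attention(["the", "race", "was", "very", "close"])[4]

static_same = bool(np.allclose(vocab["close"], vocab["close"]))
contextual_differ = bool(not np.allclose(h1, h2))
print(static_same, contextual_differ)
```

The static lookup embedding of 'close' is identical in both runs; only after the attention mixing step do the two representations diverge, which is exactly what the team observed.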
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Equation for Generating Sequence Representations
Probability Distribution Formula for an Encoder-Softmax Language Model
A pre-trained sequence encoding model processes the input sentence 'The quick fox'. After tokenization, the input is a sequence of 3 tokens: {'The', 'quick', 'fox'}. The model then generates a numerical representation, H, which is a matrix of real-valued vectors. Based on the typical function of such a model, which statement best describes the output matrix H?
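The expected answer can be made concrete with a minimal sketch: the output H stacks one real-valued vector per input token, so a 3-token input with hidden size d_e yields a 3 × d_e matrix. The hidden size and the random values below are stand-ins, not the output of any actual encoder.

```python
import numpy as np

# Illustrative only: H contains one contextual vector h_i per token x_i.
d_e = 4  # hypothetical hidden size
tokens = ["The", "quick", "fox"]
rng = np.random.default_rng(1)
H = rng.normal(size=(len(tokens), d_e))  # stand-in for the encoder output

for tok, h in zip(tokens, H):
    print(tok, h.shape)  # each token maps to a d_e-dimensional vector
print(H.shape)
```

So the best description of H is a matrix with one row per token, where each row encodes that token in the context of the whole sequence.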
Contextual Representation Analysis
Consider a pre-trained sequence encoding model that generates a numerical representation H = {h_0, h_1, ..., h_m} for an input sequence of tokens x = {x_0, x_1, ..., x_m}. The vector h_i representing the token x_i will be the same regardless of the other tokens that appear alongside it in the input sequence.