Consider a pre-trained sequence encoding model that generates a numerical representation H = {h_0, h_1, ..., h_m} for an input sequence of tokens x = {x_0, x_1, ..., x_m}. The vector h_i representing the token x_i is contextual: it depends on the other tokens that appear alongside it in the input sequence, so the same token can receive different representations in different sentences.
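A minimal sketch of the distinction, using a hypothetical toy model rather than a real pre-trained encoder: a static embedding lookup gives a token the same vector everywhere, while a contextual encoder (here, a crude averaging step standing in for self-attention) mixes in the surrounding tokens, so h_i shifts with its context.

```python
# Toy illustration (hypothetical vectors and tokens, not a real encoder):
# static lookup vs. a context-mixing step that makes h_i depend on neighbors.

EMB = {
    "the":   [1.0, 0.0],
    "quick": [0.0, 1.0],
    "fox":   [0.5, 0.5],
    "lazy":  [0.2, 0.8],
}

def static_encode(tokens):
    # h_i depends only on x_i: the same vector in every sentence.
    return [EMB[t] for t in tokens]

def contextual_encode(tokens):
    # Crude "contextualization": average each token's vector with the
    # mean of the whole sequence, so h_i shifts with surrounding tokens.
    vecs = [EMB[t] for t in tokens]
    mean = [sum(v[d] for v in vecs) / len(vecs) for d in range(2)]
    return [[(v[d] + mean[d]) / 2 for d in range(2)] for v in vecs]

a = ["the", "quick", "fox"]
b = ["the", "lazy", "fox"]

# 'fox' keeps the same static vector in both sentences...
print(static_encode(a)[2] == static_encode(b)[2])          # True
# ...but its contextual vector differs between the two sentences.
print(contextual_encode(a)[2] == contextual_encode(b)[2])  # False
```

In a real encoder such as BERT, the averaging step is replaced by stacked self-attention layers, but the consequence is the same: each row of H is a function of the entire input sequence, not of one token in isolation.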
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Related
Equation for Generating Sequence Representations
Probability Distribution Formula for an Encoder-Softmax Language Model
A pre-trained sequence encoding model processes the input sentence 'The quick fox'. After tokenization, the input is a sequence of 3 tokens: {'The', 'quick', 'fox'}. The model then generates a numerical representation, H, which is a matrix of real-valued vectors. Based on the typical function of such a model, which statement best describes the output matrix H?
Contextual Representation Analysis