Learn Before
Calculating an Output Vector in a Simple Sequence Model
A simple sequence processing model generates an output vector, (y_i), for each input vector, (x_i), in a sequence. The output (y_i) is a weighted sum of all input vectors up to and including (x_i). The weights are determined by a three-step process:
- Scoring: A score is calculated between the current input (x_i) and each input (x_j) up to and including the current one (where (j \le i)) using a dot product: (score(x_i, x_j) = x_i \cdot x_j).
- Normalization: These scores are converted into weights, (\alpha_{ij}), by applying a softmax function across all (j \le i).
- Output Calculation: The output (y_i) is calculated as the weighted sum: (y_i = \sum_{j \le i} \alpha_{ij} x_j).
Your task is to apply this process to calculate the final output vector, (y_3), for the given sequence.
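The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the exercise's solution: the helper names (`dot`, `softmax`, `output_vector`) and the example vectors are hypothetical, since the sequence itself is not reproduced here. Positions are 0-based in the code, so (y_3) corresponds to index 2.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Convert raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_vector(xs, i):
    """Return (y_i, weights) for 0-based position i, using only xs[0..i]."""
    x_i = xs[i]
    prefix = xs[: i + 1]                        # inputs with j <= i
    scores = [dot(x_i, x_j) for x_j in prefix]  # step 1: scoring
    weights = softmax(scores)                   # step 2: normalization
    y_i = [                                     # step 3: weighted sum
        sum(w * x_j[d] for w, x_j in zip(weights, prefix))
        for d in range(len(x_i))
    ]
    return y_i, weights


# Hypothetical three-vector sequence, chosen only for illustration.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_3, alphas = output_vector(xs, 2)
print("weights:", alphas)
print("y_3:", y_3)
```

For these assumed vectors the scores are 1, 1, and 2, so the third input receives the largest weight and (y_3) is pulled toward it.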
Tags
Data Science
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Application in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Parameter Matrices for Attention Transformations
Introduce weight matrices in the transformer
Calculating an Output Vector in a Simple Sequence Model
In a simple self-attention mechanism where similarity is measured by dot product and weights are normalized by a softmax function, if a current input vector x_i is perfectly orthogonal to a preceding input vector x_j, then x_j will have zero influence on the final output vector y_i.
You are calculating the output vector y_i for a single input vector x_i in a sequence using a simple self-attention mechanism that only considers preceding elements. Arrange the following computational steps in the correct chronological order.