Short Answer

Calculating Transformation Matrix Dimensions

In a neural network component, the transformation matrix for each of its parallel processing units is defined by the formula $\mathbf{W} \in \mathbb{R}^{d \times \frac{d}{\tau}}$, where $d$ is the model's embedding dimension and $\tau$ is the number of parallel units. If a model has an embedding dimension ($d$) of 768 and uses 12 parallel units ($\tau$), what are the dimensions of a single transformation matrix $\mathbf{W}$? Show your calculation.
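The arithmetic the question asks for can be checked with a minimal sketch. The function name `per_unit_matrix_shape` is hypothetical, and the divisibility check is an added assumption (the formula only yields an integer column count when the embedding dimension is a multiple of the number of units).

```python
def per_unit_matrix_shape(d: int, tau: int) -> tuple[int, int]:
    """Return (rows, cols) of one per-unit matrix W, following d x (d / tau)."""
    # Assumption: d must divide evenly by tau so the column count is an integer.
    if d % tau != 0:
        raise ValueError("embedding dimension must be divisible by the number of units")
    return (d, d // tau)


rows, cols = per_unit_matrix_shape(d=768, tau=12)
print(f"W has shape {rows} x {cols}")  # 768 / 12 = 64, so W is 768 x 64
```

With an embedding dimension of 768 and 12 units, the column count is 768 / 12 = 64, so each matrix is 768 × 64.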


Updated 2025-10-08


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science