Learn Before
Formula for Input Embedding Composition
The input embedding for a model, denoted as e, is calculated by summing three component vectors: the token embedding x (the word's identity), the positional embedding e_pos (the word's position in the sequence), and the segment embedding e_seg (the sentence the word belongs to). This operation is represented by the formula: e = x + e_pos + e_seg.
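As a minimal sketch of this composition, the snippet below sums three per-token lookup tables; the table sizes and names (token_emb, pos_emb, seg_emb) are illustrative assumptions, with random values standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, max_len, num_segments, d_model = 100, 16, 2, 8

# Three embedding tables (randomly initialized here for illustration;
# in a real model these are learned parameters)
token_emb = rng.normal(size=(vocab_size, d_model))  # word identity
pos_emb = rng.normal(size=(max_len, d_model))       # position in sequence
seg_emb = rng.normal(size=(num_segments, d_model))  # which sentence/segment

token_ids = np.array([5, 42, 7])        # token identities
positions = np.arange(len(token_ids))   # positions 0, 1, 2
segments = np.array([0, 0, 1])          # segment of each token

# e = x + e_pos + e_seg, applied per token
e = token_emb[token_ids] + pos_emb[positions] + seg_emb[segments]
print(e.shape)  # (3, 8)
```

Each row of e is one token's final input representation; all three tables must share the same embedding width d_model for the sum to be well-defined.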
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example of a T5 Machine Translation Training Sample with Special Tokens
Example of a T5 Question-Answering Sample
Example of a T5 Simplification Task Sample
Differentiating Encoder and Decoder Sequences with Start Symbols
Versatility of the T5 Text-to-Text Format
Definition of c_gold
Formula for Input Embedding Composition
A researcher wants to train a model to perform a new task: converting a sentence from passive voice to active voice. Given the passive input sentence 'The cake was eaten by the dog' and the desired active output 'The dog ate the cake', which of the following training samples is correctly structured according to the unified, prefix-based text-to-text format?
Critiquing a Text-to-Text Training Sample
A single text-to-text model is being trained on a dataset containing samples for four different tasks. Each sample's input begins with a prefix that instructs the model on what to do. Match each input sample (Source Text) with the most likely task it is intended for.
Learn After
A researcher designs a language model where the final input representation for each word is created by summing a vector for the word's identity and a vector for the sentence it belongs to. However, they intentionally omit the vector that encodes the word's specific position in the sequence. What is the most likely deficiency this model will exhibit?
Calculating a Final Input Embedding
A common method for creating the final input representation for a token in a sequence involves summing three distinct vectors. Match each vector component to its specific function in this process.