Example of a T5 Machine Translation Training Sample with Special Tokens
A training sample for a machine translation task, such as Chinese-to-English, illustrates the T5 text-to-text format. The sample consists of a task-specific prefix, the input text, and the target translation, with special tokens structuring the data for the model, for example: [CLS] Translate from Chinese to English: 你好! → <s> Hello!. Here, [CLS] serves as the start symbol for the source text (encoder input), while <s> serves as the start symbol for the target text (decoder input).
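The format above can be sketched in code. This is a minimal illustration of the convention described here, not a library API: the `make_sample` helper and its argument names are hypothetical, and the `[CLS]`/`<s>` token strings follow this card's notation.

```python
def make_sample(prefix: str, source: str, target: str) -> tuple[str, str]:
    """Build one (encoder_input, decoder_input) training pair.

    [CLS] marks the start of the source text fed to the encoder;
    <s> marks the start of the target text fed to the decoder.
    """
    encoder_input = f"[CLS] {prefix} {source}"
    decoder_input = f"<s> {target}"
    return encoder_input, decoder_input

enc, dec = make_sample("Translate from Chinese to English:", "你好!", "Hello!")
print(enc)  # [CLS] Translate from Chinese to English: 你好!
print(dec)  # <s> Hello!
```

Because the task is carried entirely by the text prefix, the same helper works unchanged for summarization, question answering, or any other task cast in this text-to-text form.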

Tags
Ch.1 Pre-training - Foundations of Large Language Models
Computing Sciences
Related
Example of a T5 Question-Answering Sample
Example of a T5 Simplification Task Sample
Differentiating Encoder and Decoder Sequences with Start Symbols
Versatility of the T5 Text-to-Text Format
Definition of c_gold
Formula for Input Embedding Composition
A researcher wants to train a model to perform a new task: converting a sentence from passive voice to active voice. Given the passive input sentence 'The cake was eaten by the dog' and the desired active output 'The dog ate the cake', which of the following training samples is correctly structured according to the unified, prefix-based text-to-text format?
Critiquing a Text-to-Text Training Sample
A single text-to-text model is being trained on a dataset containing samples for four different tasks. Each sample's input begins with a prefix that instructs the model on what to do. Match each input sample (Source Text) with the most likely task it is intended for.
Debugging Input Representation in a Sequence-to-Sequence Model
In designing a sequence-to-sequence model, an engineer decides to use one specific start symbol for all source sequences fed to the encoder and a different, unique start symbol for all target sequences fed to the decoder. Which statement best analyzes the primary benefit of this design choice?
In a sequence-to-sequence model, using a single, identical start symbol for both the source (encoder) and target (decoder) inputs would make it impossible for the model to distinguish between the two types of sequences and thus prevent it from learning the task.
Formulating NLP Tasks as Sequence-to-Sequence Mappings using Start Symbols
Learn After
A model is being trained on the following data sample for a translation task:
[CLS] Translate from Spanish to French: ¿Cómo estás? → <s> Comment vas-tu?
Based on the structure and special tokens in this sample, what specific sequences are provided as input to the model's two main components?

A data scientist is preparing a training sample for a text-to-text model designed for English-to-German translation. They create the following sample:
Translate from English to German: How are you? → <s> Wie geht es Ihnen? [CLS]
Which of the following best describes the primary error in this sample's structure?

Constructing a Training Sample for a Summarization Task
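The questions above hinge on splitting a formatted sample into its encoder and decoder portions. A minimal sketch, assuming the "→" separator convention used in these samples (the `split_sample` helper is hypothetical):

```python
def split_sample(sample: str) -> tuple[str, str]:
    """Split a formatted sample at the → separator.

    The left part ([CLS]-prefixed source) goes to the encoder;
    the right part (<s>-prefixed target) goes to the decoder.
    """
    encoder_input, decoder_input = (part.strip() for part in sample.split("→"))
    return encoder_input, decoder_input

enc, dec = split_sample(
    "[CLS] Translate from Spanish to French: ¿Cómo estás? → <s> Comment vas-tu?"
)
print(enc)  # [CLS] Translate from Spanish to French: ¿Cómo estás?
print(dec)  # <s> Comment vas-tu?
```

Note how the malformed English-to-German sample above would fail this convention: its source lacks the leading [CLS], which instead appears, incorrectly, at the end of the target.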