Learn Before
Using [SEP] Tokens for Sequence Concatenation
A common method for preparing data for sequence models is to concatenate the input and output sequences into a single string, using a special separator token such as [SEP] to mark the boundary between them. For instance, an input sequence x1 x2 ... xm and an output sequence y1 y2 ... yn can be formatted as x1 x2 ... xm [SEP] y1 y2 ... yn [SEP]. This structure lets the model process both sequences as one token stream while still distinguishing their roles.
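As a minimal sketch of this formatting (the function name and example token lists are illustrative, not from the source), the pattern x1 ... xm [SEP] y1 ... yn [SEP] can be produced like this:

```python
def concat_with_sep(x, y, sep="[SEP]"):
    """Join input tokens x and output tokens y into one sequence,
    inserting the separator token after each sequence."""
    return x + [sep] + y + [sep]

# Example: combine an input and an output sequence into one token stream.
tokens = concat_with_sep(["The", "cat", "sat"], ["on", "the", "mat"])
print(tokens)
# ['The', 'cat', 'sat', '[SEP]', 'on', 'the', 'mat', '[SEP]']
```

In practice, tokenizer libraries typically handle this insertion of separator tokens automatically, but the resulting sequence follows the same layout.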
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Probability of a Concatenated Token Sequence
An input sequence of tokens is defined as x = (The, cat, sat) and a subsequent output sequence is defined as y = (on, the, mat). Which of the following correctly represents the single, combined sequence denoted by [x, y]?
Deconstructing a Concatenated Token Sequence
Given two token sequences, x = (start, process) and y = (end, result), the concatenated sequence denoted by [x, y] is identical to the sequence denoted by [y, x].
Learn After
A language model is being prepared for a question-answering task. The model must process both the question and its corresponding answer as a single, combined sequence. If the question is 'What is the capital of France?' and the answer is 'Paris', how should these two sequences be formatted for the model using a special separator token to distinguish between them?
Diagnosing Model Training Issues from Data Formatting
Debugging Data Preprocessing for a Summarization Model
Example of Sequence Packing for Translation