Document Rotation as an Input Corruption Method
Document rotation is an input corruption method used in denoising objectives, where the model's task is to identify the original start of a sequence. A token is selected at random from the input text, and the entire sequence is rotated so that this token becomes the new beginning; the tokens that originally preceded it are moved to the end. The model is then trained on this rotated sequence to predict which token was originally first, which requires recovering the document's original order.
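The corruption step described above can be sketched as a small function (the name `rotate_document` and the token-list interface are illustrative, not from any particular library):

```python
import random

def rotate_document(tokens, rng=random):
    """Corrupt a token sequence by rotating it to start at a random token.

    The tokens that originally preceded the chosen pivot are moved to
    the end, so a model trained on the output must learn to identify
    the original starting position.
    """
    pivot = rng.randrange(len(tokens))  # index of the new first token
    return tokens[pivot:] + tokens[:pivot], pivot

tokens = "Hard work leads to success .".split()
rotated, pivot = rotate_document(tokens)
print(rotated, "original start index:", pivot)
```

During pre-training, the pair `(rotated, pivot)` would serve as the corrupted input and the target the model learns to predict.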
References
Reference of Foundations of Large Language Models Course
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Sentence Reordering as an Input Corruption Method
A research team is training a model on multi-paragraph documents. Their primary goal is to ensure the model learns the logical flow and coherence between sentences, not just the relationships between words within a single sentence. Which of the following input corruption strategies is specifically designed to target this higher-level, inter-sentence understanding?
Rationale for Sentence-Level Corruption
A language model is being trained using a denoising objective, where it learns to reconstruct original text from a corrupted version. Match each type of input corruption with the primary linguistic feature it forces the model to learn.
Learn After
Example of Document Rotation
A self-supervised learning task involves modifying an input sequence by selecting a token and rearranging the sequence so that the selected token becomes the new starting point. The part of the sequence that originally came before the selected token is moved to the end. Given the original sequence 'Hard work leads to success .', if the token 'leads' is chosen as the new starting point, what is the resulting modified sequence?
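The transformation asked about here can be verified with a short sketch (the helper `rotate_at` is illustrative; it rotates at the first occurrence of the chosen token):

```python
def rotate_at(tokens, start_token):
    # Rotate so that start_token becomes the first token; the prefix
    # that originally preceded it is appended to the end.
    i = tokens.index(start_token)
    return tokens[i:] + tokens[:i]

tokens = "Hard work leads to success .".split()
print(" ".join(rotate_at(tokens, "leads")))
# → leads to success . Hard work
```

Choosing 'leads' as the pivot moves the prefix 'Hard work' to the end, yielding 'leads to success . Hard work'.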
Reconstructing Original Sequence from Rotated Input
A language model is being trained using a technique where an input document is 'rotated'. For example, an original document is transformed into the following sequence: 'leads to success . Success brings happiness . Hard work'. What is the primary objective for the model when presented with this transformed input?
Example of Document Rotation in Denoising Autoencoding