Learn Before
Example of Token Deletion in Denoising Autoencoding
In denoising autoencoding, token deletion involves training an encoder-decoder model to recover a complete sequence after certain words have been removed outright, leaving no placeholder behind. For instance, if the original sequence is corrupted into [C] The kitten is the ball . (with the word 'chasing' deleted), the model is trained to generate the full, correct sentence: The kitten is chasing the ball .
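The corruption step described above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from any particular library: the function name `corrupt_with_token_deletion` and the deletion probability are assumptions chosen for the example.

```python
import random

def corrupt_with_token_deletion(tokens, deletion_prob=0.15, seed=0):
    """Corrupt a token sequence by independently deleting each token
    with probability `deletion_prob`.

    Unlike token masking, a deleted position leaves no placeholder,
    so the corrupted sequence is shorter than the original and the
    model must also infer *where* tokens are missing.
    """
    rng = random.Random(seed)
    return [t for t in tokens if rng.random() >= deletion_prob]

original = "The kitten is chasing the ball .".split()
corrupted = corrupt_with_token_deletion(original, deletion_prob=0.3)

# The encoder reads `corrupted`; the decoder is trained to
# reconstruct the full `original` sequence.
```

Note that the surviving tokens keep their original relative order, and the corrupted sequence carries no explicit signal about the deleted positions; this is what distinguishes deletion from masking.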
Tags
Foundations of Large Language Models
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Example Comparison of Token Masking and Token Deletion
A language model is being trained to reconstruct an original text sequence from a corrupted version. During one training step, the original input is 'The quick brown fox jumps over the lazy dog.' and the corrupted input given to the model is 'The quick fox over the lazy dog.' Based on this example, which specific input corruption technique was applied?
Analysis of Input Corruption Impact
When applying the token deletion method to corrupt an input sequence for model training, the length of the resulting sequence is identical to that of the original sequence.