Learn Before
Analysis of Input Corruption Impact
Consider two different methods for corrupting an input sentence during a language model's training. Method 1 replaces certain words with a generic placeholder symbol, keeping the sentence length unchanged. Method 2 removes certain words entirely, producing a shorter sentence. Analyze the unique challenge that Method 2 poses for a model learning the grammatical structure of a language, compared to Method 1.
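The contrast can be made concrete with a minimal sketch. This is an illustrative implementation, not code from any particular training library: `token_masking` stands in for Method 1 (a placeholder symbol preserves every position) and `token_deletion` for Method 2 (positions vanish, so the model must infer both where words are missing and what they were). The function names, the `[MASK]` symbol, and the corruption rate are all assumptions chosen for illustration.

```python
import random

def token_masking(tokens, rate=0.3, mask="[MASK]", seed=0):
    # Method 1: replace selected tokens with a placeholder.
    # Sequence length is preserved, so missing positions are explicit.
    rng = random.Random(seed)
    return [mask if rng.random() < rate else t for t in tokens]

def token_deletion(tokens, rate=0.3, seed=0):
    # Method 2: drop selected tokens entirely.
    # Sequence length shrinks; the model receives no signal about
    # WHERE words were removed, only a shorter, still-fluent-looking span.
    rng = random.Random(seed)
    return [t for t in tokens if rng.random() >= rate]

sentence = "The quick brown fox jumps over the lazy dog .".split()
masked = token_masking(sentence)
deleted = token_deletion(sentence)
assert len(masked) == len(sentence)   # positions preserved under masking
assert len(deleted) <= len(sentence)  # positions lost under deletion
```

Under masking, the placeholder marks each gap, so the model only predicts the missing content; under deletion, the model must additionally locate the gaps, which is the extra grammatical-structure challenge the prompt asks about.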
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Example Comparison of Token Masking and Token Deletion
A language model is being trained to reconstruct an original text sequence from a corrupted version. During one training step, the original input is 'The quick brown fox jumps over the lazy dog.' and the corrupted input given to the model is 'The quick fox over the lazy dog.'. Based on this example, which specific input corruption technique was applied?
Analysis of Input Corruption Impact
When applying the token deletion method to corrupt an input sequence for model training, the length of the resulting sequence is identical to the original sequence.
Example of Token Deletion in Denoising Autoencoding