Short Answer

Efficiency vs. Learning Trade-off in Denoising

When using a denoising autoencoding objective, a common technique to improve computational efficiency is to replace a contiguous span of several words in the input text with a single placeholder token. Explain the trade-off involved in choosing the length of the text spans to replace. Specifically, what is the benefit of using longer spans, and what is a potential negative consequence for the model's learning if the spans become too long?

0

1

Updated 2025-10-06

Contributors are:

Who are from:

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science