1Cademy - Optimizing Training Efficiency

Learn Before

Training Efficiency in Denoising Autoencoding

Case Study

Optimizing Training Efficiency

Based on the following scenario, propose a specific change to the team's input corruption strategy that would likely increase computational efficiency. Explain the principle that makes your proposed change effective.

Updated 2025-10-02

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science

A machine learning engineer is training a model to reconstruct a document from a corrupted version. They are considering two different strategies for creating the corrupted input:
- Strategy A: Replace 15% of the words in the document, chosen at random, each with a single [MASK] token.
- Strategy B: Replace three separate, contiguous spans of words (which together make up 15% of the document's total words) with a single [SPAN] token for each span.
Assuming all other factors are
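
To make the difference between the two strategies concrete, here is a minimal sketch that corrupts a toy document both ways and compares the length of the resulting input. It is an illustration under stated assumptions, not the method of any particular model: one word is treated as one token, the spans in Strategy B are placed at fixed, evenly spaced positions rather than sampled, and the function names `mask_tokens` and `mask_spans` are hypothetical.

```python
import random


def mask_tokens(tokens, ratio=0.15, seed=0):
    """Strategy A: replace each selected word with its own [MASK] token.

    The corrupted input keeps the same length as the original.
    """
    rng = random.Random(seed)
    n_mask = round(len(tokens) * ratio)
    chosen = set(rng.sample(range(len(tokens)), n_mask))
    return ["[MASK]" if i in chosen else tok for i, tok in enumerate(tokens)]


def mask_spans(tokens, ratio=0.15, n_spans=3):
    """Strategy B: replace n_spans contiguous spans with one [SPAN] token each.

    Assumption for this sketch: spans start at evenly spaced positions and
    the number of masked words is at least n_spans, so no span is empty.
    """
    n_mask = round(len(tokens) * ratio)
    base, extra = divmod(n_mask, n_spans)  # spread masked words over spans
    segment = len(tokens) // n_spans       # one span per segment
    out, pos = [], 0
    for s in range(n_spans):
        start = s * segment
        length = base + (1 if s < extra else 0)
        out.extend(tokens[pos:start])      # keep the words before the span
        out.append("[SPAN]")               # the whole span collapses to one token
        pos = start + length
    out.extend(tokens[pos:])               # keep the words after the last span
    return out


document = [f"w{i}" for i in range(100)]
input_a = mask_tokens(document)
input_b = mask_spans(document)
print(len(input_a), input_a.count("[MASK]"))
print(len(input_b), input_b.count("[SPAN]"))
```

For the 100-word toy document, Strategy A yields an input of 100 tokens (15 of them [MASK]), while Strategy B yields 88 tokens (3 of them [SPAN]). Both hide the same 15 words, but collapsing each span into a single placeholder shortens the sequence the model has to process, and a shorter input generally means less computation per training example.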
Efficiency vs. Learning Trade-off in Denoising
