Calculating Token Modifications in Pre-training

An input sequence for a language model contains 1,000 tokens. During a data corruption pre-training step, 15% of these tokens are randomly selected as prediction targets. These selected tokens are then modified according to a specific distribution: 80% are replaced with a special mask symbol, 10% are replaced with a random token, and 10% are left unchanged.

Based on this process, calculate the expected number of tokens in the sequence that will be: a) Replaced with a mask symbol. b) Replaced with a random token. c) Left unchanged among the selected group.
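A minimal sketch of the expected-value arithmetic follows, assuming the 15% selection rate and the 80/10/10 modification split are applied in expectation over the 1,000-token sequence (the variable names are illustrative only):

```python
# Expected token counts under the 15% selection and 80/10/10 modification split.
# The figures follow directly from the question statement.

sequence_length = 1000
selection_rate = 0.15              # fraction of tokens chosen as prediction targets

selected = sequence_length * selection_rate   # 1000 * 0.15 = 150 tokens selected

masked     = selected * 0.80       # 150 * 0.80 = 120 tokens replaced with the mask symbol
random_tok = selected * 0.10       # 150 * 0.10 = 15 tokens replaced with a random token
unchanged  = selected * 0.10       # 150 * 0.10 = 15 selected tokens left unchanged

print(f"selected: {selected:.0f}, masked: {masked:.0f}, "
      f"random: {random_tok:.0f}, unchanged: {unchanged:.0f}")
# selected: 150, masked: 120, random: 15, unchanged: 15
```

So, in expectation, 120 tokens are replaced with the mask symbol, 15 are replaced with a random token, and 15 of the selected tokens are left unchanged.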

