True/False

Consider a standard pre-training procedure for a language model where 15% of all tokens in an input are first selected for prediction. Of these selected tokens, 80% are then replaced with a special [MASK] symbol. Based on this procedure, it is guaranteed that for any given input sequence of 1,000 tokens, exactly 120 tokens will be replaced with the [MASK] symbol.

False

True
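
A minimal Python sketch of the procedure the question describes (the helper name `bert_style_mask`, the toy vocabulary, and the 10% random-replacement branch are illustrative assumptions, following the BERT-style 80/10/10 convention). It shows that 15% of 1,000 tokens yields a fixed 150 selected positions, while the 80% replacement is drawn per token, so the number of tokens actually replaced with [MASK] varies around its expectation of 0.15 × 0.8 × 1,000 = 120:

```python
import random

def bert_style_mask(tokens, mask_token="[MASK]", vocab=None, seed=None):
    """Select 15% of positions; of those, replace 80% with the mask
    symbol, 10% with a random vocabulary token, and keep 10% unchanged."""
    rng = random.Random(seed)
    out = list(tokens)
    n_select = round(0.15 * len(tokens))  # 150 for a 1,000-token input
    n_masked = 0
    for i in rng.sample(range(len(tokens)), n_select):
        r = rng.random()
        if r < 0.8:                        # 80%: substitute the mask symbol
            out[i] = mask_token
            n_masked += 1
        elif r < 0.9 and vocab:            # 10%: substitute a random token
            out[i] = rng.choice(vocab)
        # remaining 10%: leave the original token in place
    return out, n_masked

# Repeated runs select exactly 150 positions each time, but the [MASK]
# count differs from run to run around the expected value of 120.
counts = sorted({bert_style_mask(["tok"] * 1000, vocab=["a", "b"], seed=s)[1]
                 for s in range(20)})
print(counts)  # several distinct values near, but not always equal to, 120
```

Because the 80/10/10 split is applied as a per-token probability in the standard implementation, 120 is only the expected [MASK] count; the exact count follows a binomial distribution over the 150 selected tokens.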

Updated 2025-10-06

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science