1Cademy - Generalization of the Language Modeling Concept

Approach 1: Processes the input text sequentially, token by token, updating an internal state that is passed from one step to the next.
Approach 2: Processes all input tokens simultaneously, using a mechanism that directly relates every token to every other token in the input to determine context.

Learn Before

Language Models (LMs)
Transformer

Concept

Generalization of the Language Modeling Concept

Alongside the rise of the Transformer architecture, the concept of language modeling was generalized: rather than strictly predicting the next token in a sequence, it came to encompass models that learn to predict words in various ways. Many powerful Transformer-based models were pre-trained on these diverse word prediction tasks and then successfully applied to a wide variety of downstream tasks.
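
To make the contrast concrete, the sketch below builds training examples for two word prediction setups: strict next-token prediction, and a masked word prediction task in which hidden words are predicted from the surrounding text. The masked task is chosen here only as one illustration of "predicting words in various ways"; the function names, the `[MASK]` placeholder, and the `mask_rate` parameter are assumptions made for this example and do not come from any particular model or library.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder symbol for a hidden word


def next_token_examples(tokens):
    """Strict language modeling: predict each token from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_word_examples(tokens, mask_rate=0.15, seed=0):
    """Generalized word prediction: hide some tokens, predict them from the rest.

    Returns the corrupted token list and a dict mapping each hidden
    position to the original word the model should recover.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    n_mask = max(1, round(len(tokens) * mask_rate))  # hide at least one token
    positions = set(rng.sample(range(len(tokens)), n_mask))
    corrupted = [MASK if i in positions else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in sorted(positions)}
    return corrupted, targets


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()

    # Next-token prediction: the context is always a left-hand prefix.
    for context, target in next_token_examples(sentence):
        print(context, "->", target)

    # Masked word prediction: the context includes words on both sides.
    corrupted, targets = masked_word_examples(sentence, mask_rate=0.3)
    print(corrupted, "->", targets)
```

In the first setup every prediction sees only the words to its left, while in the second the prediction target can sit anywhere in the sequence. Both are still word prediction tasks, which is the sense in which the language modeling concept was generalized.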

Updated 2026-04-18
