Permuted Language Modeling
Permuted Language Modeling is a training objective that builds on Masked Language Modeling by also taking the order of token prediction into account. The input sequence is shuffled into a random order, and the model is trained to predict the tokens one by one according to this permuted arrangement. At each step, the context consists of the tokens that come earlier in the permuted order; because that order is random, each prediction is effectively conditioned on a random subset of the other tokens in the sequence.
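As a rough illustration, the sketch below shows how a random factorization order determines which tokens serve as context at each prediction step. It is a minimal sketch in plain Python; the helper name plm_training_steps and the toy four-word sentence are assumptions for illustration, not part of the course material.

```python
import random

def plm_training_steps(tokens, seed=0):
    """Sketch of Permuted Language Modeling (PLM) step construction.

    Draw a random factorization (permutation) order over the token positions
    and return, for each prediction step, the target token together with the
    context tokens that precede it in the permuted order.
    """
    rng = random.Random(seed)
    order = list(range(len(tokens)))
    rng.shuffle(order)                   # random prediction order, e.g. [2, 0, 3, 1]

    steps = []
    for i, pos in enumerate(order):
        context_positions = order[:i]    # positions already predicted in the permuted order
        context = [tokens[p] for p in context_positions]
        steps.append((tokens[pos], context))
    return steps

# Toy example: each target is predicted only from tokens earlier in the permuted order.
for target, context in plm_training_steps(["The", "quick", "brown", "fox"], seed=42):
    print(f"predict {target!r} given context {context}")
```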
References
Reference of Foundations of Large Language Models Course
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Comparison of Arbitrary Order Prediction and Masked Language Modeling
Permuted Language Modeling (PLM)
Next Sentence Prediction as an Auxiliary Training Objective
Permuted Language Modeling
Learning Contextual Representations via Masked Token Prediction
A language model is being trained with the following objective: It is given a sentence with a single word randomly obscured, such as 'The quick brown [HIDDEN] jumps over the lazy dog.' The model's only task is to predict the original hidden word, 'fox'. Which of the following best describes the specific contextual information the model is designed to use to make this prediction?
Analyzing a Model Training Process
A language model is being trained on the sentence: 'The quick brown fox jumps over the lazy dog.' Which of the following training scenarios best exemplifies the process of learning by predicting an obscured word using its full surrounding context?
MASS-style Masked Language Modeling
BERT-style Masked Language Modeling
Chain Rule of Probability for Auto-regressive Language Models
A language model is being trained on the sentence: 'The quick brown fox jumps over the lazy dog.' The model's primary purpose is to generate new text by predicting the next word in a sequence based only on the words that came before it. When the model is calculating the representation for the word 'jumps' during this process, which part of the sentence is it allowed to consider?
Model Architecture Suitability for Sentiment Analysis
Rationale for Auto-Regressive Model Design in Text Generation
Learn After
Example of an Indexed Sentence with Non-Sequential Order
Example of a Sequentially Indexed Sentence
Example of a Permuted Sentence with Non-Sequential Indexing
Example of Permuted Language Modeling with a Shuffled Sentence
Consider two different training objectives for a language model. In Objective 1, the model learns by predicting a few randomly obscured words in a sentence, using all the other visible words as context. In Objective 2, the model is given a sentence's words in a randomly shuffled order and must predict them one by one according to that shuffled sequence, only using the words that have already appeared in that sequence as context. Which of the following statements best analyzes the key advantage of Objective 2?
A language model is trained using an objective where it predicts words from an input sentence one by one, but in a randomly shuffled order. For the sentence 'The quick brown fox', the model is given the prediction order [3, 1, 4, 2], corresponding to the original word positions. Arrange the following prediction tasks in the correct sequence that the model would perform.
Evaluating a Novel Training Approach
Your team is building an internal model that must ...
Your team is pre-training a text model for an inte...
Your team is pre-training an internal LLM for a co...
Your team is pre-training an internal LLM to suppo...
Selecting a Pre-training Objective Mix for a Corporate LLM
Diagnosing Pre-training Objective Mismatch from Product Failures
Choosing a Pre-training Objective Under Data Constraints and Deployment Needs
Pre-training Objective Choice for a Multi-Modal Enterprise Writing Assistant
Root-Cause Analysis of Pre-training Objective Leakage and Coherence Failures
Selecting a Pre-training Objective for a Regulated Enterprise Assistant
Encoding Process in Permuted Language Modeling
Example of Permuted Language Modeling