Learn Before
Auto-Regressive (AR) Models
The main task of these models is to predict the next word given the previous words; the GPT model family is their best-known representative. AR architectures are built from the Transformer's decoder, so a masking mechanism is applied during training that lets the attention calculation for each word see only the content before it. This left-to-right design makes them well suited to natural language generation (NLG) tasks.
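As a concrete illustration of that masking mechanism, here is a minimal single-head self-attention sketch (using PyTorch as an assumed framework; the function name causal_self_attention and the toy shapes are illustrative, not part of the original card). Scores for future positions are set to negative infinity, so after the softmax each word places zero attention weight on anything that comes after it:

    import torch
    import torch.nn.functional as F

    def causal_self_attention(x):
        """Single-head self-attention with a causal (look-back-only) mask.

        x: (seq_len, d_model) token representations.
        Returns contextualized representations in which position i attends
        only to positions <= i, as in a Transformer decoder.
        """
        seq_len, d_model = x.shape
        # For brevity, x serves as queries, keys, and values; a real decoder
        # block would apply learned linear projections first.
        scores = (x @ x.T) / d_model ** 0.5                    # (seq_len, seq_len)
        # Causal mask: entries above the diagonal (future positions) get -inf,
        # so softmax assigns them zero attention weight.
        future = torch.triu(torch.ones(seq_len, seq_len), diagonal=1).bool()
        scores = scores.masked_fill(future, float("-inf"))
        weights = F.softmax(scores, dim=-1)                    # each row sums to 1
        return weights @ x

    # Toy check: with 4 tokens, the weight matrix is lower-triangular,
    # i.e. the representation of token 2 never depends on tokens 3 and 4.
    tokens = torch.randn(4, 8)
    print(causal_self_attention(tokens).shape)  # torch.Size([4, 8])

The lower-triangular attention pattern is exactly why these models can be sampled left to right at generation time: nothing in a prefix's representation depends on words that have not been produced yet.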
Tags
Deep Learning (in Machine Learning)
Foundations of Large Language Models Course
Data Science
Computing Sciences
Related
Auto-Encoding (AE) Models
Seq2seq Models for Text Generation
An engineering team is tasked with creating a system to analyze customer reviews and automatically classify them as 'positive', 'negative', or 'neutral'. The most critical requirement is for the model to have a deep, holistic understanding of the entire review's context to make an accurate classification. Which of the following architectural approaches for a pre-trained model would be best suited for this task?
You are an NLP engineer selecting a pre-trained model architecture for three different projects. Match each project description to the most suitable underlying model training objective.
Model Architecture Selection Flaw
Learn After
Chain Rule of Probability for Auto-regressive Language Models
Permuted Language Modeling (PLM)
A language model is being trained on the sentence: 'The quick brown fox jumps over the lazy dog.' The model's primary purpose is to generate new text by predicting the next word in a sequence based only on the words that came before it. When the model is calculating the representation for the word 'jumps' during this process, which part of the sentence is it allowed to consider?
Permuted Language Modeling
Model Architecture Suitability for Sentiment Analysis
Rationale for Auto-Regressive Model Design in Text Generation