GPT-1 (Generative Pre-trained Transformer)

GPT-1, introduced by Radford et al. in 2018, is a generative pre-trained Transformer model. Its key contribution was the two-stage approach of generative pre-training on a large unlabeled corpus followed by discriminative fine-tuning on each target task, which set a new state of the art on a range of NLP benchmarks. Architecturally, it uses a Transformer decoder, built on the multi-head self-attention mechanism introduced by Vaswani et al. in 2017, instead of RNNs or CNNs. Its pre-training loss is a standard autoregressive language-modeling objective: maximize the probability of each token given the tokens that precede it.
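
Concretely, the GPT-1 paper states this objective over an unlabeled token sequence U = (u_1, ..., u_n) as maximizing

L(U) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)

where k is the context window size and Θ are the parameters of the Transformer. As a rough illustration (a toy sketch, not the actual GPT-1 pipeline), the quantity being minimized is the average negative log-probability the model assigns to each next token:

import numpy as np

# Toy example: average next-token negative log-likelihood.
# All values here are made up for illustration; a real model would
# produce `logits` from a Transformer decoder over the context.
vocab_size = 5
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, vocab_size))                        # 4 positions
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)   # softmax
tokens = np.array([2, 0, 3, 1, 4])                               # toy token ids
# probs[i] predicts tokens[i+1]; loss is the mean -log P(next token)
nll = -np.log(probs[np.arange(4), tokens[1:]]).mean()
print(nll)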

Updated 2025-10-06

Tags

Data Science
