Learn Before
Concept
Megatron-Turing NLG
The Megatron-Turing NLG is a -billion-parameter large language model that was trained using the GPT-2 Transformer decoder architecture on a dataset of billion tokens.
0
1
Updated 2026-05-15
Tags
D2L
Dive into Deep Learning @ D2L
Related
The creators of the large-scale, unsupervised language model introduced in 2019 initially withheld the full version from the public, citing concerns about potential misuse. Which statement best evaluates the significance of this 'staged release' strategy for the field of artificial intelligence?
Analysis of Model Scaling Impact
Evaluating Model Capabilities in a Research Scenario
In-Context Learning
In-Context Learning (ICL)
Megatron-Turing NLG
Gopher
CLIP (Contrastive Language-Image Pre-training)