Learn Before
InstructGPT
InstructGPT is a language model derived from GPT-3 that was explicitly fine-tuned to align with human intent. It was trained with reinforcement learning from human feedback (RLHF) so that it follows a diverse set of instructions more effectively than its base model.
Tags
D2L
Dive into Deep Learning @ D2L
Related
A research institution is planning to develop a new language model with approximately 175 billion parameters. Based on the characteristics of a model of this magnitude, which of the following represents the most significant trade-off the institution must evaluate?
A 2020 research paper by Brown et al. introduced a generative pre-trained transformer model that was particularly groundbreaking. What was the most defining characteristic of this model that set it apart from its direct predecessors?
The largest version of the generative pre-trained transformer model introduced in 2020 by Brown et al. is notable for its scale, containing ____ parameters.
Performance Scaling in GPT-3
GPT-4
InstructGPT