Learn Before
Training a language model with a framework that incorporates human feedback is described using standard reinforcement learning terminology. Match each reinforcement learning term on the left with its corresponding component or concept in this language model training context on the right.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Policy in the Context of LLMs
LLM Policy as a Probability Distribution
Identifying the Agent and Action in a Training Scenario
When a language model is fine-tuned using a system that incorporates human preferences, this process is often conceptualized within a reinforcement learning framework. Which of the following statements correctly analyzes the components of this interaction?