In a system designed to align a language model with human preferences, one component functions as a 'critic'. It takes the current state (e.g., a conversation history) as input and outputs a single scalar value predicting the expected total future reward from that state. This component's architecture is often a large language model with a final linear layer for the scalar output. Which statement best distinguishes this specific component from others in the system?
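The critic described above can be sketched in a few lines: a backbone maps the token sequence (the state) to hidden states, and a single linear "value head" maps the last hidden state to one scalar. This is a minimal illustrative sketch, not a real implementation; `backbone`, `HIDDEN`, and the random weights are all hypothetical stand-ins for an actual transformer and its learned parameters.

```python
import numpy as np

HIDDEN = 8  # toy hidden size; a real LLM would use thousands of dimensions
rng = np.random.default_rng(0)

def backbone(token_ids):
    """Stand-in for an LLM backbone: one hidden vector per input token.

    A real critic would run a full transformer here; we fake the hidden
    states with random numbers purely to show the shapes involved.
    """
    return rng.standard_normal((len(token_ids), HIDDEN))

# The value head: a single linear layer mapping a hidden state to a scalar.
w = rng.standard_normal(HIDDEN)
b = 0.0

def value(token_ids):
    """Predict the expected total future reward from this state."""
    hidden = backbone(token_ids)
    # Use the last token's hidden state as the summary of the whole state,
    # then project it to a single number with the linear head.
    return float(hidden[-1] @ w + b)

v = value([101, 42, 7])
print(v)  # one scalar per state -- not a distribution over next tokens
```

The key contrast with the other components is visible in the output type: the policy model emits a probability distribution over the vocabulary at each step, and a reward model scores a completed response, while the critic returns one scalar per *state* predicting future return.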
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Distinguishing Model Outputs in Preference Alignment
Diagnosing a Reinforcement Learning System