Reward Model Implementation Analysis
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.2 Generative Models - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Pair-wise Ranking Loss Formula for RLHF Reward Model
Input Formulation for the RLHF Reward Model
Diagram of Reward Score Calculation using an LLM
An engineer is implementing a reward model by adapting a pre-trained language model. After feeding a concatenated prompt and response sequence into the model, they have access to the final layer's hidden state vector for each token in the sequence. To derive a single scalar reward score from these vectors, which of the following procedures should they implement?
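A minimal sketch of the standard design this question points at: a learned linear "value head" applied to the final token's hidden state. This is a hedged illustration, not the source's own code; the class name RewardHead and the tensor shapes are assumptions made here for the example.

```python
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Hypothetical helper: maps final-layer hidden states to one scalar reward."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.linear = nn.Linear(hidden_size, 1)  # learned projection to a scalar

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: [batch, seq_len, hidden_size], final layer of the LM.
        # The last token has attended to the whole prompt + response, so its
        # hidden state serves as the summary of the pair.
        last_token_state = hidden_states[:, -1, :]        # [batch, hidden_size]
        return self.linear(last_token_state).squeeze(-1)  # [batch] scalar rewards

# Toy check with random states standing in for real model outputs:
states = torch.randn(2, 10, 768)      # batch=2, seq_len=10, hidden=768
print(RewardHead(768)(states).shape)  # torch.Size([2])
```

The last token is the natural choice because, in a decoder-only model, it is the only position whose hidden state has attended to the entire prompt-plus-response sequence.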
You are tasked with implementing a reward model to score a response generated for a given prompt. Arrange the following steps in the correct chronological order to transform the prompt-response pair into a final scalar reward score.
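One plausible chronological ordering, sketched end to end under stated assumptions: Hugging Face transformers with "gpt2" purely as a stand-in backbone, a single unpadded input sequence (so index -1 is the final token), and an untrained value head (in practice the head would be trained with the pair-wise ranking loss referenced above).

```python
import torch
from transformers import AutoModel, AutoTokenizer

# "gpt2" is only a stand-in; any decoder-only LM works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
# Untrained here; in practice this head is fit with a pairwise ranking loss.
value_head = torch.nn.Linear(model.config.hidden_size, 1)

def reward_score(prompt: str, response: str) -> float:
    # 1. Concatenate the prompt and the response into one input sequence.
    text = prompt + response
    # 2. Tokenize the concatenated sequence.
    inputs = tokenizer(text, return_tensors="pt")
    # 3. Run a forward pass to obtain final-layer hidden states per token.
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # [1, seq_len, hidden]
    # 4. Select the hidden state of the last token (no padding assumed).
    last = hidden[:, -1, :]
    # 5. Project it to a single scalar reward with the linear head.
    return value_head(last).item()

print(reward_score("Q: What is 2 + 2?\nA: ", "4"))
```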