Input Formulation for the RLHF Reward Model
To evaluate a response, the reward model processes a sequence created by concatenating the original input prompt x with the generated output y. This combined sequence, denoted [x, y], is fed into the model from left to right to derive its representation.
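A minimal sketch of this input formulation, using NumPy with a toy embedding table standing in for a pre-trained language model (the function and variable names here are illustrative assumptions, not a specific library API):

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_backbone(input_ids, emb):
    # Stand-in for a pre-trained LM: maps token ids to hidden-state vectors.
    # A real reward model would run a Transformer here instead.
    return emb[input_ids]                     # (seq_len, hidden)

def reward_score(input_ids, emb, w, b):
    # Feed the concatenated sequence [x, y] through the backbone left to
    # right, take the final token's hidden state as the sequence
    # representation, and map it to a single scalar with a linear head.
    hidden = toy_backbone(input_ids, emb)     # (seq_len, hidden)
    last = hidden[-1]                         # representation of [x, y]
    return float(last @ w + b)                # scalar reward

vocab_size, hidden_size = 100, 16
emb = rng.normal(size=(vocab_size, hidden_size))   # toy embedding table
w, b = rng.normal(size=hidden_size), 0.0           # toy linear value head

x = np.array([3, 7, 9])                       # tokenized prompt (hypothetical ids)
y = np.array([4, 2])                          # tokenized response (hypothetical ids)
r = reward_score(np.concatenate([x, y]), emb, w, b)   # score for [x, y]
```

The key point the sketch illustrates is that the prompt and response are concatenated before scoring, so the reward reflects the response in the context of its prompt rather than in isolation.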
References
Reference of Foundations of Large Language Models Course
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Pair-wise Ranking Loss Formula for RLHF Reward Model
Input Formulation for the RLHF Reward Model
Diagram of Reward Score Calculation using an LLM
An engineer is implementing a reward model by adapting a pre-trained language model. After feeding a concatenated prompt and response sequence into the model, they have access to the final layer's hidden state vector for each token in the sequence. To derive a single scalar reward score from these vectors, which of the following procedures should they implement?
You are tasked with implementing a reward model to score a response generated for a given prompt. Arrange the following steps in the correct chronological order to transform the prompt-response pair into a final scalar reward score.
Reward Model Implementation Analysis
Learn After
Sequence Representation for Reward Calculation in RLHF
A team is developing a model to automatically assign a quality score to an AI-generated response. To do this, the model must be given some text as input. Which of the following best explains why the model should be given the original prompt concatenated with the AI's response, instead of just the AI's response alone?
Reward Model Input Preparation
Debugging a Reward Model's Input Formulation