Learn Before
Data Collection for Reward Modeling in RLHF
The first step in training a reward model is to gather a dataset of human feedback. This process starts by using a large language model (LLM) to generate several different candidate outputs for a given input prompt, x. Human labelers then evaluate these outputs, and their feedback can be collected in a variety of ways, such as pointwise scores, pairwise comparisons, or rankings over the candidates.
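The sketch below illustrates one common shape this data collection takes in practice: sample several candidate completions for each prompt, then record which of a pair a human labeler prefers. The names `sample_completions`, `PreferenceRecord`, and `collect_pairwise_feedback` are hypothetical placeholders rather than any specific library's API, and the human choice is simulated randomly here; a real pipeline would call an actual model with sampling enabled and route the comparison to a labeling interface.

```python
import random
from dataclasses import dataclass


@dataclass
class PreferenceRecord:
    """One unit of human-feedback data for reward-model training."""
    prompt: str    # the input prompt x
    chosen: str    # the output the labeler preferred
    rejected: str  # the output the labeler did not prefer


def sample_completions(prompt: str, k: int = 4) -> list[str]:
    """Hypothetical stand-in for sampling k diverse outputs from an LLM
    (e.g., via temperature or nucleus sampling)."""
    return [f"candidate {i} for: {prompt}" for i in range(k)]


def collect_pairwise_feedback(prompts: list[str]) -> list[PreferenceRecord]:
    """For each prompt, generate candidates and record a preference over a
    pair of them. The 'human' judgment is simulated here for illustration."""
    dataset = []
    for prompt in prompts:
        candidates = sample_completions(prompt)
        a, b = random.sample(candidates, 2)  # pick two candidates to compare
        chosen, rejected = (a, b) if random.random() < 0.5 else (b, a)
        dataset.append(PreferenceRecord(prompt, chosen, rejected))
    return dataset


if __name__ == "__main__":
    records = collect_pairwise_feedback(["Explain RLHF in one sentence."])
    print(records[0])
```

Pairwise comparisons are a commonly used format because relative judgments tend to be easier for labelers to give consistently than absolute scores, and the resulting (prompt, chosen, rejected) triples can be used directly to train the reward model on preference data.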
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Related
Data Collection for Reward Modeling in RLHF
A machine learning team is implementing a training process that uses human feedback to align a language model. They have access to two base models: a general-purpose pre-trained language model (Model A) and a version of that model that has been further fine-tuned on a set of instructions (Model B). For the first stage of their process, which of the following initialization plans is correct for the policy, reference, reward, and value models?
Rationale for Freezing the Reference Model in RLHF
Analyzing an RLHF Initialization Error
Learn After
Example of a User Prompt in RLHF
Training a Reward Model with Preference Data
Techniques for Generating Diverse Outputs in RLHF
A team is developing a system to align a language model with human preferences. Their data collection process involves providing a prompt to an existing, fine-tuned model, which then generates a single response. A human labeler then assigns a quality score from 1 to 10 to this single response. This process is repeated for thousands of different prompts. What is the most significant flaw in this methodology for the purpose of creating a robust preference-based reward model?
Arrange the following steps in the correct chronological order to describe the data collection process for training a reward model.
Designing a Data Collection Pipeline for a Creative Writing Assistant