Case Study

Choosing a Feedback Collection Method

A company is developing an AI assistant to perform factual, objective tasks, such as summarizing technical reports and extracting specific data points. They have a large team of annotators who can be trained to verify the accuracy and completeness of the AI's responses against source documents. The company needs a simple and scalable way to collect human feedback to train a reward model. Would a method where each annotator assigns an absolute quality score (e.g., on a 1-5 scale) to each individual AI response be an appropriate choice for this scenario? Justify your answer.
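To make the scenario concrete, here is a minimal sketch of what pointwise (absolute-score) feedback looks like when used to train a reward model. This is a hypothetical toy example, not the company's actual pipeline: responses are assumed to be already represented as fixed-length feature vectors, annotator scores are on the 1-5 scale described above, and a simple linear model is fit by gradient descent to predict the score.

```python
import numpy as np

# Toy pointwise reward-model training (hypothetical illustration).
# Assumption: each AI response is a 4-dimensional feature vector, and
# annotators assign an absolute 1-5 quality score to each response.
rng = np.random.default_rng(0)

# Simulated data: 200 annotated responses. True quality depends linearly
# on the features; annotator noise is added, and scores are clipped to 1-5.
X = rng.normal(size=(200, 4))
true_w = np.array([1.0, -0.5, 0.3, 0.0])
scores = np.clip(3.0 + X @ true_w + rng.normal(scale=0.2, size=200), 1.0, 5.0)

# Fit weights w and bias b by minimizing mean squared error with
# plain gradient descent; the fitted predictor acts as the reward model.
w = np.zeros(4)
b = 0.0
lr = 0.1
for _ in range(500):
    pred = X @ w + b
    err = pred - scores
    w -= lr * (X.T @ err) / len(X)
    b -= lr * err.mean()

def reward(x):
    """Predicted absolute quality score for a response feature vector x."""
    return float(x @ w + b)
```

Note that each training example stands alone here: the model regresses directly on a single annotator's absolute score, with no comparison between responses. This is what makes the method simple and scalable, but also what exposes it to inter-annotator calibration differences, which is the trade-off the question asks you to weigh.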

Updated 2025-10-10

Tags

Ch.4 Alignment - Foundations of Large Language Models