
Learn Before

Combined Reward Formula

Multiple Choice

An AI alignment team is evaluating a language model's response using three distinct reward models: Helpfulness, Harmlessness, and Conciseness. For a specific response, the models provide the following scores and are assigned the following weights:

- Helpfulness: Score = 8.0, Weight = 2.0
- Harmlessness: Score = 9.0, Weight = 3.0
- Conciseness: Score = 6.0, Weight = 1.0

Using the weighted average formula for combining rewards, what is the final aggregated reward score for this response? (Assume
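
The assumption at the end of the question is cut off in this copy, so the following is only a sketch of the arithmetic under two possible readings: an unnormalized weighted sum of the scores, and a weighted average that divides that sum by the total weight. The function names `weighted_sum` and `weighted_average` are illustrative and do not come from the source.

```python
def weighted_sum(scores, weights):
    """Sum of score * weight over all reward models (no normalization)."""
    return sum(scores[name] * weights[name] for name in scores)


def weighted_average(scores, weights):
    """Weighted sum divided by the total weight (assumed normalization)."""
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return weighted_sum(scores, weights) / total_weight


# Values taken from the question.
scores = {"helpfulness": 8.0, "harmlessness": 9.0, "conciseness": 6.0}
weights = {"helpfulness": 2.0, "harmlessness": 3.0, "conciseness": 1.0}

# 8.0*2.0 + 9.0*3.0 + 6.0*1.0 = 16 + 27 + 6
print(weighted_sum(scores, weights))                # 49.0

# 49 / (2 + 3 + 1) = 49 / 6
print(round(weighted_average(scores, weights), 4))  # 8.1667
```

If the missing assumption says the weights are normalized by their sum, the aggregated reward is 49/6 ≈ 8.17; if the formula is a plain weighted sum, it is 49.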

Updated 2025-09-26
