Short Answer

Component Analysis of the Combined Reward Formula

In the formula for a combined reward score, $r_{\text{combine}} = \frac{1}{K} \sum_{k=1}^{K} w_k \cdot r_k(\mathbf{x}, \mathbf{y})$, explain the distinct purpose of the weight term ($w_k$) compared to the normalization factor ($\frac{1}{K}$). Why is it important to have both in the calculation?
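To make the two terms concrete, here is a minimal sketch of the formula in Python. It assumes the K reward scores $r_k(\mathbf{x}, \mathbf{y})$ have already been computed as plain floats; the function name `combine_rewards` and the sample values are hypothetical, chosen only for illustration.

```python
def combine_rewards(rewards, weights):
    """Compute r_combine = (1/K) * sum_k w_k * r_k.

    rewards: the K per-model scores r_k(x, y), already evaluated (assumption).
    weights: the K weights w_k, one per reward model.
    """
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    K = len(rewards)
    if K == 0:
        raise ValueError("at least one reward model is required")
    # w_k scales each model's contribution individually.
    weighted_sum = sum(w * r for w, r in zip(weights, rewards))
    # 1/K divides by the number of models, independent of the weights.
    return weighted_sum / K


# Hypothetical scores from two reward models for the same (x, y) pair.
equal = combine_rewards([1.0, 0.0], [1.0, 1.0])     # both models count equally
skewed = combine_rewards([1.0, 0.0], [2.0, 0.0])    # only the first model counts
```

In this sketch, changing `weights` shifts how much each model contributes, while the division by `K` is the same for every model and depends only on how many scores are combined.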

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science