Final Reward Score Calculation in RLHF
Once a comprehensive vector representation of the concatenated input sequence is obtained from the Transformer layer stack, a final output layer, typically a linear transformation, is applied directly on top of this representation. This layer maps the vector to a single scalar reward score, denoted by $r$ or $r(x, y)$, representing the evaluation of the output $y$ for the given prompt $x$.
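A minimal PyTorch sketch of this final step, assuming a hypothetical `RewardHead` module and a hidden size of 768 (neither name nor value comes from the source): the head is just a linear layer that projects the sequence representation down to one scalar per prompt-response pair.

```python
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Maps a sequence representation to a scalar reward score.

    Illustrative sketch only; the module and attribute names are
    assumptions, not taken from the source.
    """
    def __init__(self, hidden_size: int):
        super().__init__()
        # Linear layer projecting the hidden vector to a single scalar.
        self.value_head = nn.Linear(hidden_size, 1)

    def forward(self, sequence_repr: torch.Tensor) -> torch.Tensor:
        # sequence_repr: (batch, hidden_size), e.g. the hidden state of the
        # final (end-of-sequence) token for a causal Transformer.
        # Returns: (batch,) tensor of scalar rewards r(x, y).
        return self.value_head(sequence_repr).squeeze(-1)

# Usage sketch: a random vector stands in for the Transformer output.
head = RewardHead(hidden_size=768)
h = torch.randn(2, 768)   # representations for 2 prompt+response pairs
reward = head(h)          # 2 scalar reward scores
```

The design choice here mirrors the text: all the representational work happens in the Transformer stack, so the reward head can stay as small as a single linear projection.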
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Final Reward Score Calculation in RLHF
A team is building a system to evaluate text sequences. They use a model that processes text one token at a time from left to right, where the output for any given token is influenced only by the tokens that came before it. To obtain a single vector that represents an entire input sequence for scoring, which of the following strategies is most appropriate for this type of model?
Reward Model Implementation Analysis
Critique of a Sequence Representation Method
Learn After
Reward Score Formula for LLM-based Reward Models
End-of-Sequence Reward Assignment in RLHF
In a system designed to evaluate the quality of generated text, a complex neural network first processes a prompt and its corresponding response, ultimately producing a high-dimensional vector that captures the nuanced meaning and relationship between them. What is the essential final step required to convert this complex vector into a practical, usable evaluation, and what is the nature of its output?
Troubleshooting a Reward Model's Output
From Representation to Reward