
Sequence Representation for Reward Calculation in RLHF

To produce a single vector representing an entire prompt-response sequence in a reward model, the sequence is processed from left to right using forced decoding. Because language modeling restricts each position to attending only to its left context, the output of the top-most Transformer layer at any earlier position cannot encapsulate the full sequence. To resolve this, a special end symbol, ⟨/s⟩, is appended to the sequence. The output vector of the Transformer layer stack at this final position, which has access to every preceding token, is then used as the representation of the entire sequence.
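A minimal sketch of this idea in NumPy, assuming the hidden states of the top Transformer layer are already available and that the reward head is a simple linear projection (the function and variable names here are illustrative, not from the source):

```python
import numpy as np

def sequence_reward(hidden_states: np.ndarray, w: np.ndarray, b: float = 0.0) -> float:
    """Score a prompt-response pair with a linear reward head.

    hidden_states: array of shape [T, d], the top-layer Transformer outputs
    for the sequence with the end symbol (e.g. "</s>") appended as the final
    token. Because decoding is causal (left-to-right), only the final
    position has attended to every token, so its vector serves as the
    representation of the whole sequence.
    """
    h_last = hidden_states[-1]        # vector at the end-symbol position
    return float(h_last @ w + b)      # scalar reward

# toy example: T = 4 positions (3 tokens + end symbol), hidden size d = 5
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 5))           # stand-in for real Transformer outputs
w = rng.normal(size=5)                # reward-head weights
r = sequence_reward(H, w)
```

In a real RLHF pipeline the `hidden_states` would come from a forward pass of the reward model over the concatenated prompt and response, and `w` and `b` would be trained on preference comparisons; only the selection of the final position and the scalar projection are shown here.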


Updated 2026-04-20



Ch.2 Generative Models - Foundations of Large Language Models