Example

Example of Using CoT in a Preference Labeling Prompt

An effective way to improve preference labeling is to incorporate a Chain-of-Thought (CoT) rationale within the prompt. For instance, the prompt could include an example that not only states a preference for one response over another but also provides a step-by-step explanation for this choice, guiding the labeler to apply similar critical reasoning.

0

1

Updated 2026-05-03

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Computing Sciences

Foundations of Large Language Models Course

Ch.4 Alignment - Foundations of Large Language Models