
Learning from Human Feedback

Learning from human feedback is an alignment method applied after pre-training and supervised fine-tuning to reduce the risk that a model generates factually incorrect, biased, or harmful content. The process involves collecting human evaluations of the model's responses to a variety of inputs: human annotators compare or rate the outputs according to their preferences and interests. This feedback is then used to further train the model, improving its alignment with user expectations.
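
A common way to use such feedback is to train a reward model on pairwise human preferences and then use its scores to guide further fine-tuning. The following is a minimal sketch, not the method described in any specific course material: it assumes responses are already encoded as fixed-size feature vectors and uses a small MLP as the reward model, whereas a real pipeline would score token sequences with a language-model backbone.

```python
# Minimal sketch: fitting a reward model to pairwise human preference data.
# Assumption: "chosen"/"rejected" are feature vectors of the preferred and
# non-preferred responses for the same prompt (hypothetical toy data).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a response representation to a single scalar reward."""
    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: the human-preferred response should score higher."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy preference dataset standing in for collected human comparisons.
torch.manual_seed(0)
chosen = torch.randn(256, 128)
rejected = torch.randn(256, 128)

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):
    loss = preference_loss(model(chosen), model(rejected))
    opt.zero_grad()
    loss.backward()
    opt.step()

# The trained reward model can then serve as the feedback signal for further
# fine-tuning of the language model, e.g., in RL-based alignment.
```

The pairwise formulation is what makes this practical: annotators only need to say which of two responses they prefer, and the model learns a scalar reward consistent with those judgments.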


Updated 2026-05-02


Tags

Ch.2 Generative Models - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.4 Alignment - Foundations of Large Language Models