A development team is working on two separate improvement goals for their language model. Match each goal with the alignment methodology it primarily represents.
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A research team is refining a language model using two distinct methods. In Method A, they train the model on a large dataset of specific commands paired with ideal, human-written responses that perfectly execute those commands (e.g., Command: 'List three benefits of solar power.' Ideal Response: A list of exactly three benefits). In Method B, they show human raters two different model-generated responses to the same open-ended prompt (e.g., 'Write a short, encouraging note') and ask the raters to choose which response they prefer. The model is then updated based on these preferences. What fundamental difference in goals do these two methods represent?
Diagnosing Chatbot Performance Issues
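The two methods in the related question above differ in their training signal: Method A (supervised fine-tuning) minimizes the negative log-likelihood of a single ideal response, while Method B (preference-based alignment, as in RLHF's reward-modelling step) uses a pairwise Bradley-Terry loss on rater choices. A minimal sketch of that contrast, with illustrative function names (not from any specific library):

```python
import math

# Method A (supervised fine-tuning): the model is trained to reproduce an
# ideal, human-written response. The training signal is the negative
# log-likelihood of that exact response under the model.
def sft_loss(model_prob_of_ideal_response: float) -> float:
    """Negative log-likelihood of the single 'gold' response."""
    return -math.log(model_prob_of_ideal_response)

# Method B (preference-based alignment): there is no single gold answer.
# A rater picks the better of two responses, and a Bradley-Terry loss
# pushes the preferred response's score above the rejected one's.
def preference_loss(score_preferred: float, score_rejected: float) -> float:
    """-log(sigmoid(score_preferred - score_rejected))."""
    margin = score_preferred - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# SFT rewards matching the reference exactly: loss shrinks as the model
# concentrates probability on the ideal response.
print(sft_loss(0.9))   # small loss
print(sft_loss(0.5))   # larger loss

# Preference training only cares about relative quality between responses.
print(preference_loss(2.0, 0.5))  # small loss: preferred response scored higher
print(preference_loss(0.5, 2.0))  # large loss: rater's choice scored lower
```

This makes the difference in goals concrete: SFT pulls the model toward one correct output per command, whereas preference learning shapes a ranking over open-ended outputs without ever defining a single correct answer.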