Learn Before
Case Study
Choosing an Alignment Strategy for a Startup
Given the startup's constraints, which method should they choose? Justify your recommendation by explaining the key trade-off between the two approaches.
Updated 2025-10-05
Tags
Ch.5 Inference - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Evaluation in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A development team aims to align a large language model with human preferences. Their methodology is as follows:
- For each input prompt, generate 16 different responses from the model.
- Use a pre-trained 'reward model' to assign a quality score to each of the 16 responses.
- Select only the single highest-scoring response for that prompt.
- Compile a new dataset consisting of thousands of these prompt-and-best-response pairs.
- Fine-tune the original language model on this new dataset using standard supervised learning methods.
Which statement most accurately evaluates this team's approach?
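The data-construction steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the team's actual code: `build_best_of_n_dataset`, `toy_generate`, and `toy_reward` are hypothetical names, and the toy generator and reward function stand in for a real language model and a real reward model. The final supervised fine-tuning step is left out because it depends on the training framework in use.

```python
import random
from typing import Callable, List, Tuple


def build_best_of_n_dataset(
    prompts: List[str],
    generate: Callable[[str], str],
    reward: Callable[[str, str], float],
    n: int = 16,
) -> List[Tuple[str, str]]:
    """For each prompt, sample n responses and keep only the top-scoring one."""
    dataset = []
    for prompt in prompts:
        # Step 1: sample n candidate responses from the model.
        candidates = [generate(prompt) for _ in range(n)]
        # Steps 2-3: score every candidate and keep the single best.
        best = max(candidates, key=lambda response: reward(prompt, response))
        # Step 4: store the (prompt, best response) pair.
        dataset.append((prompt, best))
    return dataset


# Hypothetical stand-ins for a language model and a reward model.
_rng = random.Random(0)


def toy_generate(prompt: str) -> str:
    # Pretend sampling: append a random "quality" number to the prompt.
    return f"{prompt} -> answer {_rng.randint(0, 99)}"


def toy_reward(prompt: str, response: str) -> float:
    # Pretend scoring: read back the trailing number as the reward.
    return float(response.rsplit(" ", 1)[-1])


pairs = build_best_of_n_dataset(["prompt A", "prompt B"], toy_generate, toy_reward)
for prompt, best in pairs:
    print(prompt, "|", best)
```

Note that the reward model is only used to filter data here; the resulting pairs would then be passed to an ordinary supervised fine-tuning loop.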
Comparing Model Alignment Techniques