Improving a Sarcasm-Detecting AI
Analyze the training methodology described in the following scenario. Identify its fundamental weakness for the given task and propose an alternative data collection strategy that would be more effective. Justify your proposal by explaining how it addresses the core problem.
Tags
Ch.2 Generative Models - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Ch.4 Alignment - Foundations of Large Language Models
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Annotation Simplicity in RLHF: Recognition over Demonstration
Exploration Advantage of RLHF
Dataset Composition for RL Fine-Tuning in RLHF
Limitations of Static Datasets in Model Fine-Tuning
A development team aims to fine-tune a language model to be 'helpful and harmless', qualities that are nuanced and difficult to exemplify perfectly. They consider two strategies:
- Supervised Approach: Have human experts write ideal, 'gold-standard' responses to a wide range of prompts for the model to imitate.
- Preference-Based Approach: Have the model generate multiple responses to each prompt, and then have human experts rank these responses from best to worst.
What is the primary reason that the preference-based approach is often more effective for aligning a model with such complex human values?