1Cademy - Evaluating Surrogate Objectives for a Mental Well-being AI

Learn Before

Surrogate Objectives in AI Alignment

Essay

Evaluating Surrogate Objectives for a Mental Well-being AI

An AI development team is tasked with modifying a social media platform's content recommendation algorithm. The true, intended objective is to 'improve the mental well-being of its users.' The team proposes three different measurable surrogate objectives to train the AI on. Evaluate the following three options. In your response, identify the most promising surrogate objective and justify your choice by analyzing the potential benefits and, more importantly, the potential negative consequences or failure modes of each option.

1. Maximize daily user engagement time on the platform.
2. Maximize the ratio of positive reactions (e.g., 'like', 'love') to negative reactions (e.g., 'angry') on content shown to the user.
3. Maximize scores on a voluntary, daily in-app survey asking users to rate their mood.
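
To make the three options concrete, the sketch below shows one way each surrogate could be computed from logged data. It is a minimal illustration, not a recommended design: the `UserDay` record, its field names, the 1-to-5 mood scale, and the additive `smoothing` term are all assumptions introduced here and do not come from the prompt.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class UserDay:
    """One user's activity on one day (hypothetical schema)."""
    minutes_on_platform: float
    positive_reactions: int
    negative_reactions: int
    mood_score: Optional[float]  # assumed 1-5 scale; None if the survey was skipped


def engagement_objective(days: Sequence[UserDay]) -> float:
    """Option 1: mean daily engagement time, in minutes."""
    return sum(d.minutes_on_platform for d in days) / len(days)


def reaction_ratio_objective(days: Sequence[UserDay], smoothing: float = 1.0) -> float:
    """Option 2: positive-to-negative reaction ratio.

    The smoothing term is an assumption added here so the ratio stays
    defined when there are no negative reactions at all.
    """
    positive = sum(d.positive_reactions for d in days)
    negative = sum(d.negative_reactions for d in days)
    return (positive + smoothing) / (negative + smoothing)


def survey_objective(days: Sequence[UserDay]) -> Optional[float]:
    """Option 3: mean mood score over users who answered the survey."""
    scores = [d.mood_score for d in days if d.mood_score is not None]
    return sum(scores) / len(scores) if scores else None


def survey_response_rate(days: Sequence[UserDay]) -> float:
    """Fraction of user-days with a survey answer (the survey is voluntary)."""
    return sum(d.mood_score is not None for d in days) / len(days)


# Made-up sample data, for illustration only.
sample = [
    UserDay(30.0, 4, 1, 4.0),
    UserDay(120.0, 9, 0, None),  # heaviest user skipped the survey
    UserDay(45.0, 2, 3, 2.0),
]

print("engagement:", engagement_objective(sample))
print("reaction ratio:", reaction_ratio_objective(sample))
print("survey mean:", survey_objective(sample))
print("response rate:", survey_response_rate(sample))
```

Writing the metrics out can help when analyzing failure modes. In this made-up sample the survey mean ignores the user who spent the most time on the platform, and the reaction ratio can rise simply because content that draws negative reactions is no longer shown.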

Updated 2025-10-06
