Concept

Surrogate Objectives in AI Alignment

A common strategy in AI alignment involves creating a 'surrogate objective,' which is a measurable, proxy goal designed to approximate the true, often more complex, intended objective. The AI system is then trained to optimize for this surrogate.

0

1

Updated 2026-05-03

Contributors are:

Who are from:

Tags

Ch.5 Inference - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Ch.4 Alignment - Foundations of Large Language Models