Case Study

Analyzing Reward Function Equivalence

A robotics team is training an agent to navigate a grid. They test two different reward functions, r and r'. Under both, the agent learns exactly the same optimal path. Your task is to determine the transformation function f that relates the two reward functions according to the formula: r'(s_t, a_t, s_{t+1}) = r(s_t, a_t, s_{t+1}) + f(s_t, a_t, s_{t+1})
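One well-known family of transformations that leaves the optimal policy unchanged is potential-based reward shaping, where f(s_t, a_t, s_{t+1}) = γΦ(s_{t+1}) − Φ(s_t) for some potential function Φ over states. The case study does not state that this is the intended answer, so the following is only a minimal sketch under assumptions: the 3x3 grid, the goal cell, the discount factor, the base reward, and the potential (negative Manhattan distance to the goal) are all hypothetical choices made for illustration. It runs value iteration under r and r' and compares the resulting sets of optimal actions.

```python
import itertools

# All constants below are hypothetical, chosen only for illustration.
GAMMA = 0.9
SIZE = 3
GOAL = (2, 2)  # terminal cell
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = list(itertools.product(range(SIZE), range(SIZE)))


def step(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    return (row, col) if 0 <= row < SIZE and 0 <= col < SIZE else state


def base_reward(s, a, s_next):
    """r: +1 for entering the goal, 0 otherwise (assumed base reward)."""
    return 1.0 if s_next == GOAL else 0.0


def potential(s):
    """Assumed potential: negative Manhattan distance, so it is 0 at the goal."""
    return -float(abs(s[0] - GOAL[0]) + abs(s[1] - GOAL[1]))


def shaping(s, a, s_next):
    """f(s, a, s') = gamma * phi(s') - phi(s)."""
    return GAMMA * potential(s_next) - potential(s)


def shaped_reward(s, a, s_next):
    """r' = r + f."""
    return base_reward(s, a, s_next) + shaping(s, a, s_next)


def q_values(reward_fn, sweeps=200):
    """Value iteration with the goal treated as a terminal state of value 0."""
    v = {s: 0.0 for s in STATES}

    def q(s, a):
        s_next = step(s, a)
        return reward_fn(s, a, s_next) + GAMMA * v[s_next]

    for _ in range(sweeps):
        for s in STATES:
            if s != GOAL:
                v[s] = max(q(s, a) for a in ACTIONS)
    return {s: {a: q(s, a) for a in ACTIONS} for s in STATES if s != GOAL}


def greedy_actions(q_table, tol=1e-6):
    """Set of optimal actions per state (ties are kept)."""
    return {
        s: frozenset(a for a, val in qs.items() if val >= max(qs.values()) - tol)
        for s, qs in q_table.items()
    }


q_base = q_values(base_reward)
q_shaped = q_values(shaped_reward)
base_policy = greedy_actions(q_base)
shaped_policy = greedy_actions(q_shaped)
print("Same optimal actions in every state:", base_policy == shaped_policy)
```

In this sketch the two Q-tables differ only by a per-state offset, Q'(s, a) = Q(s, a) − Φ(s), which does not depend on the action. That is why the ranking of actions in each state, and hence the optimal path, is the same under both reward functions.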

Updated 2025-10-04

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science