Case Study

Evaluating a Policy Change

Based on the scenario provided, calculate the ratio of the new policy's action probability to the old policy's action probability. Then, explain what this ratio implies about how the observed reward should be used to evaluate the new policy.
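
The scenario's figures are not reproduced on this page, so the following is only a minimal sketch of the calculation the prompt asks for, under the usual importance-sampling assumption that the action was logged under the old policy. The function names and the numbers (0.5, 0.25, and a reward of 10.0) are hypothetical and are not taken from the scenario.

```python
def importance_ratio(p_new: float, p_old: float) -> float:
    """Ratio of the new policy's probability to the old policy's
    probability for the same logged action."""
    if p_old <= 0.0:
        # The ratio is undefined if the old policy could never take the action.
        raise ValueError("old policy must give the logged action positive probability")
    return p_new / p_old


def weighted_reward(reward: float, p_new: float, p_old: float) -> float:
    """Observed reward rescaled by the ratio: a one-sample
    importance-weighted estimate of the reward under the new policy."""
    return importance_ratio(p_new, p_old) * reward


# Hypothetical values for illustration only (not from the scenario).
ratio = importance_ratio(p_new=0.5, p_old=0.25)
estimate = weighted_reward(reward=10.0, p_new=0.5, p_old=0.25)
print(ratio, estimate)
```

A ratio above 1 means the new policy chooses the logged action more often than the old policy did, so the observed reward is weighted up when evaluating the new policy. A ratio below 1 means the reward is weighted down, and a ratio of exactly 1 leaves it unchanged.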

Updated 2025-09-26

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science