Multiple Choice

An AI agent is being updated. From a particular state, the original 'reference' version of the agent had a 40% chance of selecting action 'X'. The new 'current' version of the agent, after some training, now has only a 10% chance of selecting action 'X' from that same state. Based on this information, what can be concluded about the ratio of the current policy's probability to the reference policy's probability for taking action 'X'?
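As a minimal sketch of the arithmetic behind this question, the ratio can be computed by dividing the current policy's probability by the reference policy's probability. The function name `probability_ratio` is hypothetical and chosen only for illustration. The log-ratio is included on the assumption that it is the form alignment objectives often work with.

```python
import math


def probability_ratio(p_current: float, p_reference: float) -> float:
    """Return pi_current(a|s) / pi_reference(a|s) for one state-action pair.

    Hypothetical helper for illustration; not taken from the source material.
    """
    if p_reference <= 0.0:
        raise ValueError("reference probability must be positive")
    return p_current / p_reference


# Values from the question: reference policy 40%, current policy 10%.
ratio = probability_ratio(0.10, 0.40)
log_ratio = math.log(ratio)

print(f"ratio = {ratio:.2f}")
print(f"log-ratio = {log_ratio:.3f}")
```

A ratio below 1 (equivalently, a negative log-ratio) indicates that the current policy is less likely than the reference policy to select action 'X' from this state. A ratio of exactly 1 would mean the two policies agree on that action.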

Updated 2025-09-26

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Application in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science