Multiple Choice

An agent is in state s and must choose between two actions, A and B. Under its current policy, the agent chooses action A with probability 0.7 and action B with probability 0.3. The expected total future reward for taking action A from state s is +20, and the expected total future reward for taking action B from state s is -10. Based on this information, which of the following statements correctly describes the relationship between the value of being in state s and the values of taking each action?
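The quantities in the question can be related by the standard identity that the value of a state under a policy is the policy-weighted average of its action values, V(s) = sum over a of pi(a|s) * Q(s, a). The sketch below is one way to check the arithmetic. The names `policy`, `action_values`, and `state_value` are illustrative assumptions and do not come from the question; only the probabilities and rewards do.

```python
# Minimal sketch: state value as the policy-weighted average of action values.
# All names here are hypothetical; the numbers are those given in the question.

policy = {"A": 0.7, "B": 0.3}            # pi(a | s)
action_values = {"A": 20.0, "B": -10.0}  # Q(s, a): expected total future reward


def state_value(policy, action_values):
    """Return V(s) = sum over actions of pi(a | s) * Q(s, a)."""
    return sum(prob * action_values[action] for action, prob in policy.items())


v_s = state_value(policy, action_values)

# Advantage of each action: how much better or worse it is than the average.
advantages = {action: q - v_s for action, q in action_values.items()}

print(f"V(s) = {v_s:.2f}")
for action, adv in advantages.items():
    print(f"A(s, {action}) = {adv:+.2f}")
```

Under these assumptions, V(s) works out to about 0.7 × 20 + 0.3 × (-10) = 11, which lies between the two action values and closer to that of action A, the action the policy favors.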

Updated 2025-09-28

Tags

Data Science

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science