Multiple Choice

An agent's goal is to navigate a simple environment and maximize its total reward. The agent is currently in state 'S'. From this state, it can take one of two actions: 'Action 1', which always yields a reward of +10, or 'Action 2', which always yields a reward of -5. Consider two possible behavior patterns for the agent when it is in state 'S':

  • Behavior A: The agent chooses 'Action 1' with a 100% probability.
  • Behavior B: The agent chooses 'Action 1' with a 50% probability and 'Action 2' with a 50% probability.

Which behavior pattern is superior for achieving the agent's goal, and why?
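
One way to compare the two behaviors is to compute the expected reward each one earns in state 'S'. The sketch below does that arithmetic. It assumes a single decision step with no later states and uses the fixed rewards given in the question. The names `REWARDS`, `expected_reward`, `behavior_a`, and `behavior_b` are illustrative and do not come from the source.

```python
# Rewards for each action in state 'S', taken from the question text.
REWARDS = {"Action 1": 10, "Action 2": -5}


def expected_reward(policy):
    """Return the expected one-step reward for a policy.

    `policy` maps each action name to the probability of choosing it.
    Assumption: one decision step only, with deterministic rewards.
    """
    return sum(prob * REWARDS[action] for action, prob in policy.items())


# Behavior A: always choose 'Action 1'.
behavior_a = {"Action 1": 1.0}
# Behavior B: choose each action half of the time.
behavior_b = {"Action 1": 0.5, "Action 2": 0.5}

value_a = expected_reward(behavior_a)  # 1.0 * 10
value_b = expected_reward(behavior_b)  # 0.5 * 10 + 0.5 * (-5)

print(f"Behavior A expected reward: {value_a}")
print(f"Behavior B expected reward: {value_b}")
```

Under these assumptions, Behavior A has an expected reward of 10 and Behavior B has an expected reward of 2.5, so the behavior that always picks the higher-reward action earns more on average.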

Updated 2025-10-08

Tags

Ch.4 Alignment - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Evaluation in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science