An agent is being trained to play tic-tac-toe on a 3x3 grid. The agent's set of possible moves is defined as {'place X on square 1', 'place X on square 2', ..., 'place X on square 9'}. Critically evaluate this set of moves. Identify one significant flaw and explain why it is a problem for the agent's learning process.

Google

An action is a single move the agent can make; the set of all such moves is the agent's action set (or action space). At each decision point, the agent chooses one action from this set. In a video game, for example, the action set might include running right or left, jumping high or low, crouching, or standing still.
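The distinction between one action and the full action set can be sketched in a few lines of Python. This is only an illustration: the `Action` enum, `ACTION_SET`, and `choose_action` are hypothetical names, and the uniform-random choice stands in for whatever policy a real agent would learn.

```python
import random
from enum import Enum


class Action(Enum):
    """Hypothetical action set for the video game example."""
    RUN_RIGHT = "run right"
    RUN_LEFT = "run left"
    JUMP_HIGH = "jump high"
    JUMP_LOW = "jump low"
    CROUCH = "crouch"
    STAND_STILL = "stand still"


# The action set is the collection of every available action.
ACTION_SET = list(Action)


def choose_action(rng: random.Random) -> Action:
    """Pick a single action from the action set.

    Assumption: a uniform random policy, used purely as a placeholder.
    """
    return rng.choice(ACTION_SET)


rng = random.Random(0)
action = choose_action(rng)  # one action, drawn from the set
print(f"Chosen action: {action.value}")
```

Here `ACTION_SET` is what the agent chooses from, while `action` is the single choice made at one decision point.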

Action in Reinforcement Learning

When reinforcement learning is applied to large language models (LLMs), an action, denoted as $$a$$, corresponds to a possible decision the agent can make. Specifically, an action is the next token the model selects from its vocabulary, so the action set is the vocabulary itself.
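As a rough sketch of this idea, the snippet below treats a toy vocabulary as the action set and samples one token index as the action. The vocabulary, the logit values, and the function names are all assumptions made for illustration; a real LLM would produce logits from its network over a much larger vocabulary.

```python
import math
import random

# Hypothetical toy vocabulary; each entry is one possible action.
VOCAB = ["the", "cat", "sat", "mat", "<eos>"]


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample an action a, i.e. the index of one token in VOCAB."""
    probs = softmax(logits)
    return rng.choices(range(len(VOCAB)), weights=probs, k=1)[0]


# Made-up scores standing in for a model's output at one step.
logits = [2.0, 0.5, 0.1, -1.0, -3.0]
rng = random.Random(0)

a = sample_action(logits, rng)  # the action
token = VOCAB[a]                # the predicted token it corresponds to
print(f"action a = {a}, token = {token!r}")
```

Sampling is only one way to pick the action; taking the highest-probability token (greedy selection) would be an equally valid choice for this illustration.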

Action in the Context of LLMs

A simple robotic arm is being trained to sort objects on a conveyor belt. The arm can perform only three distinct movements from its resting position: it can pick up an object, it can place an object in a bin, or it can do nothing and wait. In this learning scenario, what does the set {pick up, place, wait} represent?

Evaluating an Agent's Action Set

Read the following scenario and identify the complete set of possible moves the agent can make at each decision point.
