1Cademy - The Challenges and Possible Solutions

Learn Before

Reinforcement Learning in Text Games

Concept

The Challenges and Possible Solutions

Partial Observability: The agent's observations represent only a small part of the underlying truth of the game state. Dealing with this requires extensive exploration, and the agent may fail to understand the connection between its current observations and the related actions. An additional challenge arises when the agent moves to the next state without completing a prerequisite of future states.

E.g.: The agent has to use a lantern to light its way, but before doing so it needs to check whether it acquired the lantern in a previous state.

Possible Solutions: handcrafted reward functions, a combination of a patience parameter and intrinsic motivation for new knowledge, and a dynamically learned knowledge graph.
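The lantern example and the knowledge-graph idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `KnowledgeGraph` class, the triple format, the `intrinsic_reward` scale, and the `can_use_lantern` helper are hypothetical names, not an implementation from the referenced survey. The graph stores facts extracted from observations, pays a small bonus for newly learned facts, and lets the agent check a prerequisite before acting.

```python
# Illustrative sketch only: all names and values are hypothetical.

class KnowledgeGraph:
    """Stores (subject, relation, object) facts gathered from observations."""

    def __init__(self):
        self.triples = set()

    def update(self, new_triples):
        """Add facts and return how many of them were not known before."""
        fresh = set(new_triples) - self.triples
        self.triples |= fresh
        return len(fresh)

    def has(self, subject, relation, obj):
        """Return True if the fact is already in the graph."""
        return (subject, relation, obj) in self.triples


def intrinsic_reward(num_new_facts, scale=0.1):
    """Bonus proportional to the number of newly learned facts (assumed form)."""
    return scale * num_new_facts


def can_use_lantern(kg):
    """Prerequisite check: the lantern must already have been acquired."""
    return kg.has("player", "has", "lantern")


kg = KnowledgeGraph()
blocked = can_use_lantern(kg)  # lantern not yet acquired

# Facts the agent might extract after a "take lantern" step.
bonus = intrinsic_reward(
    kg.update([("player", "has", "lantern"), ("player", "in", "cellar")])
)
ready = can_use_lantern(kg)

# Seeing the same fact again yields no further bonus.
repeat_bonus = intrinsic_reward(kg.update([("player", "has", "lantern")]))
```

In this sketch the bonus rewards new knowledge rather than new states, so revisiting a state that teaches nothing new earns nothing.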

Large State Space: RL problems require exploration to find better solutions; however, this can cause natural overfitting. For instance, setting a reward signal for encountering new states can remove the agent's ability to learn new knowledge about the environment, in favor of simply searching for unseen states.

E.g.: Treasure Hunter dealt with this problem by increasing the complexity of the environment with obstacles.
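The pitfall of rewarding new states can be made concrete with a count-based novelty bonus. This is a minimal sketch under stated assumptions: the `NoveltyBonus` class, the use of the raw observation text as the state key, and the 1/sqrt(count) decay are illustrative choices, not the method of any system named above.

```python
# Illustrative sketch only: names and the decay formula are assumptions.
import math
from collections import Counter


class NoveltyBonus:
    """Pays a bonus that shrinks each time the same state text is seen again."""

    def __init__(self, scale=1.0):
        self.visits = Counter()
        self.scale = scale

    def __call__(self, state_text):
        self.visits[state_text] += 1
        return self.scale / math.sqrt(self.visits[state_text])


novelty = NoveltyBonus()
first = novelty("You are in a dark cellar.")   # first visit: full bonus
second = novelty("You are in a dark cellar.")  # repeat visit: smaller bonus
other = novelty("You are in a kitchen.")       # unseen state: full bonus again
```

An agent that maximizes only this bonus is paid for reaching unseen states regardless of whether it learns anything useful there, which is the failure mode described above.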

Normally, the solutions relate to the general RL problem of exploration vs. exploitation.
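One common way to trade off exploration and exploitation is epsilon-greedy action selection. The sketch below is a generic illustration, not a method taken from the referenced survey; the `epsilon_greedy` function, the action strings, and the value estimates are hypothetical.

```python
# Illustrative sketch only: action names and values are made up.
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one.

    q_values: dict mapping an action string to its estimated value.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))      # explore
    return max(q_values, key=q_values.get)     # exploit


q = {"take lantern": 0.8, "go north": 0.3, "open door": 0.1}

# epsilon = 0.0: always exploit the current best estimate.
greedy_action = epsilon_greedy(q, epsilon=0.0)

# epsilon = 1.0: always explore, sampling uniformly from the actions.
rng = random.Random(0)
explored = {epsilon_greedy(q, epsilon=1.0, rng=rng) for _ in range(200)}
```

In practice epsilon is often decayed over training so the agent explores early and exploits later; that schedule is left out here to keep the sketch short.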

Some additional problems include: Large, Combinatorial and Sparse Action Spaces; Long-Term Credit Assignment; Understanding Parser Feedback & Language Acquisition; Commonsense Reasoning & Affordance Extraction; Knowledge Representation.

Updated 2022-08-14

Contributors are:

Vidheesh Kumar Nacode

Who are from:

Syracuse University

References

A Survey of Text Games for Reinforcement Learning informed by Natural Language
