Learn Before
The Challenges and Possible Solutions
Partial Observability: The agent's observations represent only a small part of the underlying truth. The agent needs extensive exploration, including many failures, to understand the connection between its current observations and the related actions. An additional challenge arises when the agent moves to the next state without completing a prerequisite of future states.
Eg: The agent has to use a lantern to light its way, but before doing so it needs to check whether it acquired the lantern in a previous state.
Possible Solutions - Handcrafted reward functions; a combination of a patience parameter and intrinsic motivation for new knowledge; a dynamically learned knowledge graph.
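The lantern example can be made concrete with a toy knowledge graph that remembers facts across observations. This is only a minimal sketch: the class `KnowledgeGraph`, the function `choose_action`, the observation strings, and the rule-based extraction are all hypothetical, and a real agent would learn the graph dynamically rather than rely on fixed string matching.

```python
class KnowledgeGraph:
    """Toy memory of (subject, relation, object) triples built from observations."""

    def __init__(self):
        self.triples = set()

    def update(self, observation):
        # Hypothetical hand-written extraction rules (an assumption for illustration).
        text = observation.lower()
        if "you pick up the lantern" in text:
            self.triples.add(("you", "have", "lantern"))
        if "the room is dark" in text:
            self.triples.add(("room", "is", "dark"))

    def has(self, triple):
        return triple in self.triples


def choose_action(kg):
    """Check the prerequisite (owning the lantern) before trying to use it."""
    if kg.has(("room", "is", "dark")):
        if kg.has(("you", "have", "lantern")):
            return "turn on lantern"
        return "go back"  # prerequisite not met: the lantern was never acquired
    return "go north"


kg = KnowledgeGraph()
kg.update("The room is dark.")
action_without_lantern = choose_action(kg)
kg.update("You pick up the lantern.")
action_with_lantern = choose_action(kg)
```

Because the graph persists between steps, the agent can answer "did I acquire the lantern earlier?" even though the current observation no longer mentions it.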
Large State Space: RL problems require exploration to find better solutions; however, this can cause natural overfitting. For instance, setting a reward signal for encountering new states removes the incentive to learn new knowledge about the environment in favor of simply searching for unseen states.
Eg: Treasure Hunter dealt with this problem by increasing the complexity of the environment with obstacles.
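The kind of reward signal described above can be sketched as a count-based novelty bonus. This is an illustrative assumption, not the method of any particular paper: the class name `NoveltyBonus`, the `scale` parameter, and the 1/sqrt(count) decay are chosen only for the example.

```python
from collections import Counter
import math


class NoveltyBonus:
    """Intrinsic reward that shrinks each time the same state text is seen again."""

    def __init__(self, scale=1.0):
        self.counts = Counter()
        self.scale = scale  # hypothetical weight of the bonus

    def __call__(self, state_text):
        self.counts[state_text] += 1
        # The first visit pays the full bonus; repeat visits pay less.
        return self.scale / math.sqrt(self.counts[state_text])


bonus = NoveltyBonus()
first_visit = bonus("dark cellar")
second_visit = bonus("dark cellar")
```

If this bonus dominates the task reward, the agent is paid mainly for reaching unseen states, which is the failure mode described above.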
Normally, the solutions relate to the general RL problem of exploration vs. exploitation.
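One standard way to trade off exploration and exploitation is epsilon-greedy action selection. The sketch below assumes a small, hypothetical table of action values `q`; it is not tied to any specific text-game agent.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one.

    q_values: dict mapping an action string to its estimated value.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))  # explore
    return max(q_values, key=q_values.get)  # exploit


# Hypothetical value estimates for three text commands.
q = {"go north": 0.2, "take lantern": 0.9, "open door": 0.5}
greedy_action = epsilon_greedy(q, epsilon=0.0)
```

With `epsilon=0.0` the agent always exploits; raising epsilon makes it try other commands more often.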
Some additional problems include: Large, Combinatorial and Sparse Action Spaces; Long-Term Credit Assignment; Understanding Parser Feedback & Language Acquisition; Commonsense Reasoning & Affordance Extraction; Knowledge Representation.
Tags
Natural language processing
Data Science