Learn Before
Why Function Approximation is Needed?
In traditional (tabular) Q-learning, there is no need to train a value-prediction model. These methods keep a table that records the Q-value for every (state, action) pair. However, when the state-action space is large or continuous, the table becomes infeasible to store and fill in; this is the curse of dimensionality. To deal with it, a parameterized function is used in place of the table: the TD target computed in the Q-learning update rule serves as the training label for a network that predicts Q-values. That is exactly why function approximation is needed. Conversely, if the (state, action) space is fairly small, or a well-formed policy can search it directly for the best action, there is no need for function approximation.
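The idea above can be sketched with linear function approximation, the simplest case: instead of a Q-table, each action gets a weight vector, and the TD target acts as the label for a semi-gradient update. All names, feature sizes, and constants below are illustrative assumptions, not a definitive implementation.

```python
# Minimal sketch (assumed toy setup): Q-learning with linear function
# approximation replacing the Q-table. Names and sizes are illustrative.

N_FEATURES = 4
N_ACTIONS = 2
ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor

# One weight vector per action; Q(s, a) = weights[a] . features(s)
weights = [[0.0] * N_FEATURES for _ in range(N_ACTIONS)]

def q_value(features, action):
    return sum(w * f for w, f in zip(weights[action], features))

def td_update(features, action, reward, next_features, done):
    """Semi-gradient Q-learning step: the TD target plays the role of
    the training label for the predicted Q-value."""
    target = reward
    if not done:
        target += GAMMA * max(q_value(next_features, a)
                              for a in range(N_ACTIONS))
    error = target - q_value(features, action)
    for i in range(N_FEATURES):
        weights[action][i] += ALPHA * error * features[i]
    return error

# One illustrative transition: reward 1.0 into a terminal state.
td_update([1.0, 0.0, 0.5, 0.0], action=0, reward=1.0,
          next_features=[0.0, 0.0, 0.0, 0.0], done=True)
print(round(q_value([1.0, 0.0, 0.5, 0.0], 0), 3))  # → 0.125
```

A deep Q-network follows the same scheme, only with a neural network in place of the linear model; the table of exact values is traded for a function that generalizes across states it has never visited.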
Tags
Data Science
Related
Reward vs. Value Function
Rewards, Returns and Value functions
Why Function Approximation is Needed?
Bellman Equation
Reward Function in Reinforcement Learning
Sparse Rewards in NLP
Reward Models as the Basis for Value Functions
An autonomous agent is being trained to navigate a maze and reach a specific exit. The agent receives a small negative feedback signal (-0.1) for every step it takes and a large positive feedback signal (+100) only when it reaches the correct exit. The agent's goal is to maximize its total feedback score. Given this feedback structure, what is the most likely reason the agent might fail to learn to solve the maze, even after many attempts?
Evaluating Reward Structures for a Chatbot
Designing a Reward System for a Robot Dog