Abstract
For complex systems, reinforcement learning must be generalised from a discrete form to a continuous form to cope with large state or action spaces. In this paper, the generalisation of reinforcement learning to continuous state spaces is investigated using a policy gradient approach, with fuzzy logic serving as the function approximator. To guarantee learning convergence, a policy approximator and a state-action value approximator, both based on fuzzy logic, are employed. The convergence of the resulting learning algorithm is justified.
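The abstract does not give the update equations, so the following is only a minimal sketch of a generic fuzzy actor-critic policy-gradient loop under assumed design choices, not the paper's exact algorithm: Gaussian membership functions supply normalised rule firing strengths as features, a Gaussian policy (the actor) is linear in those features, and a linear state-action value function (the critic) is trained with a SARSA-style temporal-difference error. All names, constants, and the toy one-dimensional environment are illustrative.

```python
import numpy as np

# --- Fuzzy basis: Gaussian membership functions over a 1-D state space ---
CENTERS = np.linspace(-1.0, 1.0, 7)   # rule centres (hypothetical choice)
WIDTH = 0.3                           # membership width (hypothetical choice)

def fuzzy_features(s):
    """Normalised firing strengths of the fuzzy rules at state s."""
    w = np.exp(-0.5 * ((s - CENTERS) / WIDTH) ** 2)
    return w / w.sum()

# --- Actor: Gaussian policy whose mean is linear in the fuzzy features ---
theta = np.zeros_like(CENTERS)        # actor parameters
SIGMA = 0.2                           # fixed exploration noise

def policy_mean(s):
    return theta @ fuzzy_features(s)

def sample_action(s, rng):
    return rng.normal(policy_mean(s), SIGMA)

# --- Critic: linear state-action value over [phi(s); a * phi(s)] ---
w = np.zeros(2 * len(CENTERS))

def q_features(s, a):
    phi = fuzzy_features(s)
    return np.concatenate([phi, a * phi])

def q_value(s, a):
    return w @ q_features(s, a)

# --- SARSA-style actor-critic loop (toy task: drive the state to 0) ---
ALPHA_ACTOR, ALPHA_CRITIC, GAMMA = 0.01, 0.1, 0.95
rng = np.random.default_rng(0)
s = rng.uniform(-1, 1)
a = sample_action(s, rng)
for step in range(5000):
    s_next = np.clip(s + 0.1 * a, -1, 1)
    r = -s_next ** 2                              # reward: stay near the origin
    a_next = sample_action(s_next, rng)
    delta = r + GAMMA * q_value(s_next, a_next) - q_value(s, a)
    w += ALPHA_CRITIC * delta * q_features(s, a)  # critic TD update
    # policy-gradient step: grad log pi(a|s) = (a - mu(s)) / sigma^2 * phi(s)
    theta += ALPHA_ACTOR * delta * (a - policy_mean(s)) / SIGMA**2 * fuzzy_features(s)
    s, a = s_next, a_next
```

Sharing the same fuzzy rule base between actor and critic mirrors the abstract's pairing of a fuzzy policy approximator with a fuzzy state-action value approximator, though the paper's own parameterisation and convergence argument may differ.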
Original language | English |
---|---|
Title of host publication | IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004 |
Place of Publication | Piscataway, NJ. |
Publisher | IEEE |
Pages | 936-940 |
Number of pages | 5 |
ISBN (Print) | 0780386148 |
Publication status | Published - 2004 |
Event | 2004 IEEE International Conference on Robotics and Biomimetics, ROBIO 2004 - Shenyang, China |
Duration | 22 Aug 2004 → 26 Aug 2004 |
Conference
Conference | 2004 IEEE International Conference on Robotics and Biomimetics, ROBIO 2004 |
---|---|
Country/Territory | China |
City | Shenyang |
Period | 22/08/04 → 26/08/04 |
Keywords
- fuzzy Q-learning
- policy gradient method
- reinforcement learning
- learning convergence
- approximation theory
- convergence of numerical methods
- functions
- fuzzy sets
- large scale systems
- learning algorithms
- state space methods