A policy gradient reinforcement learning algorithm with fuzzy function approximation

Dongbing Gu, Erfu Yang

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book

3 Citations (Scopus)

Abstract

For complex systems, reinforcement learning has to be generalised from a discrete form to a continuous form due to large state or action spaces. In this paper, the generalisation of reinforcement learning to continuous state space is investigated by using a policy gradient approach. Fuzzy logic is used as a function approximation in the generalisation. To guarantee learning convergence, a policy approximator and a state action value approximator are employed for the reinforcement learning. Both of them are based on fuzzy logic. The convergence of the learning algorithm is justified.
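As a rough illustration of the two-approximator idea described in the abstract, the sketch below pairs a policy approximator and a state-action value approximator that are both linear in normalised fuzzy firing strengths. It is not the authors' algorithm: the toy one-dimensional task, the Gaussian membership functions, the softmax policy, the expected-SARSA critic update, and every name and parameter value (`CENTERS`, `WIDTH`, `ACTIONS`, `alpha`, `beta`, `gamma`) are assumptions made only for illustration.

```python
import math
import random

# All constants below are hypothetical choices for a toy example.
CENTERS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # centres of the fuzzy sets (one rule each)
WIDTH = 0.3                             # shared membership-function width
ACTIONS = [-0.1, 0.1]                   # discrete moves along the state axis


def fuzzy_features(s):
    """Normalised firing strengths of Gaussian membership functions at state s."""
    mu = [math.exp(-((s - c) / WIDTH) ** 2) for c in CENTERS]
    total = sum(mu)
    return [m / total for m in mu]


def policy(theta, phi):
    """Softmax policy whose action preferences are linear in the fuzzy features."""
    prefs = [sum(t * p for t, p in zip(row, phi)) for row in theta]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(h - top) for h in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def q_value(w, phi, a):
    """State-action value estimate, also linear in the fuzzy features."""
    return sum(wi * p for wi, p in zip(w[a], phi))


def step(s, a):
    """Toy dynamics on [-1, 1]; the reward is largest (zero) at s = 0."""
    s_next = max(-1.0, min(1.0, s + ACTIONS[a]))
    return s_next, -abs(s_next)


def train(episodes=300, horizon=30, alpha=0.05, beta=0.2, gamma=0.9, seed=0):
    rng = random.Random(seed)
    n, k = len(CENTERS), len(ACTIONS)
    theta = [[0.0] * n for _ in range(k)]  # policy (actor) parameters
    w = [[0.0] * n for _ in range(k)]      # value (critic) parameters
    for _ in range(episodes):
        s = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            phi = fuzzy_features(s)
            probs = policy(theta, phi)
            a = rng.choices(range(k), weights=probs)[0]
            s_next, r = step(s, a)
            phi_next = fuzzy_features(s_next)
            probs_next = policy(theta, phi_next)
            # Critic: expected-SARSA temporal-difference update.
            v_next = sum(p * q_value(w, phi_next, b) for b, p in enumerate(probs_next))
            q_sa = q_value(w, phi, a)
            delta = r + gamma * v_next - q_sa
            for i in range(n):
                w[a][i] += beta * delta * phi[i]
            # Actor: move along q_sa * grad log pi(a|s), where for a softmax
            # d log pi(a|s) / d theta[b][i] = (1[b == a] - pi(b|s)) * phi[i].
            for b in range(k):
                g = (1.0 if b == a else 0.0) - probs[b]
                for i in range(n):
                    theta[b][i] += alpha * q_sa * g * phi[i]
            s = s_next
    return theta, w
```

Calling `train()` returns the two parameter tables, and `policy(theta, fuzzy_features(s))` then gives the action probabilities at any state `s`. Keeping both approximators linear in the same fuzzy features keeps the sketch short; whether this matches the structure for which the paper justifies convergence is not something the abstract states.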

Language: English
Title of host publication: IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004
Place of Publication: Piscataway, NJ.
Publisher: IEEE
Pages: 936-940
Number of pages: 5
ISBN (Print): 0780386148
DOIs: https://doi.org/10.1109/ROBIO.2004.1521910
Publication status: Published - 2004
Event: 2004 IEEE International Conference on Robotics and Biomimetics, ROBIO 2004 - Shenyang, China
Duration: 22 Aug 2004 to 26 Aug 2004

Conference

Conference: 2004 IEEE International Conference on Robotics and Biomimetics, ROBIO 2004
Country: China
City: Shenyang
Period: 22/08/04 to 26/08/04

Fingerprint

Reinforcement learning
Learning algorithms
Fuzzy logic
Large scale systems

Keywords

  • fuzzy Q-learning
  • policy gradient method
  • reinforcement learning
  • learning convergence
  • approximation theory
  • convergence of numerical methods
  • functions
  • fuzzy sets
  • large scale systems
  • learning algorithms
  • state space methods

Cite this

Gu, D., & Yang, E. (2004). A policy gradient reinforcement learning algorithm with fuzzy function approximation. In IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004 (pp. 936-940). Piscataway, NJ.: IEEE. https://doi.org/10.1109/ROBIO.2004.1521910
Gu, Dongbing ; Yang, Erfu. / A policy gradient reinforcement learning algorithm with fuzzy function approximation. IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004. Piscataway, NJ. : IEEE, 2004. pp. 936-940
@inproceedings{3637ca12463d4ad38c878dac3930f99b,
title = "A policy gradient reinforcement learning algorithm with fuzzy function approximation",
abstract = "For complex systems, reinforcement learning has to be generalised from a discrete form to a continuous form due to large state or action spaces. In this paper, the generalisation of reinforcement learning to continuous state space is investigated by using a policy gradient approach. Fuzzy logic is used as a function approximation in the generalisation. To guarantee learning convergence, a policy approximator and a state action value approximator are employed for the reinforcement learning. Both of them are based on fuzzy logic. The convergence of the learning algorithm is justified.",
keywords = "fuzzy Q-learning, policy gradient method, reinforcement learning, learning convergence, approximation theory, convergence of numerical methods, functions, fuzzy sets, large scale systems, learning algorithms, state space methods",
author = "Dongbing Gu and Erfu Yang",
year = "2004",
doi = "10.1109/ROBIO.2004.1521910",
language = "English",
isbn = "0780386148",
pages = "936--940",
booktitle = "IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004",
publisher = "IEEE",

}

Gu, D & Yang, E 2004, A policy gradient reinforcement learning algorithm with fuzzy function approximation. in IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004. IEEE, Piscataway, NJ., pp. 936-940, 2004 IEEE International Conference on Robotics and Biomimetics, ROBIO 2004, Shenyang, China, 22/08/04. https://doi.org/10.1109/ROBIO.2004.1521910

A policy gradient reinforcement learning algorithm with fuzzy function approximation. / Gu, Dongbing; Yang, Erfu.

IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004. Piscataway, NJ. : IEEE, 2004. p. 936-940.


TY - GEN
T1 - A policy gradient reinforcement learning algorithm with fuzzy function approximation
AU - Gu, Dongbing
AU - Yang, Erfu
PY - 2004
Y1 - 2004
N2 - For complex systems, reinforcement learning has to be generalised from a discrete form to a continuous form due to large state or action spaces. In this paper, the generalisation of reinforcement learning to continuous state space is investigated by using a policy gradient approach. Fuzzy logic is used as a function approximation in the generalisation. To guarantee learning convergence, a policy approximator and a state action value approximator are employed for the reinforcement learning. Both of them are based on fuzzy logic. The convergence of the learning algorithm is justified.
AB - For complex systems, reinforcement learning has to be generalised from a discrete form to a continuous form due to large state or action spaces. In this paper, the generalisation of reinforcement learning to continuous state space is investigated by using a policy gradient approach. Fuzzy logic is used as a function approximation in the generalisation. To guarantee learning convergence, a policy approximator and a state action value approximator are employed for the reinforcement learning. Both of them are based on fuzzy logic. The convergence of the learning algorithm is justified.
KW - fuzzy Q-learning
KW - policy gradient method
KW - reinforcement learning
KW - learning convergence
KW - approximation theory
KW - convergence of numerical methods
KW - functions
KW - fuzzy sets
KW - large scale systems
KW - learning algorithms
KW - state space methods
UR - http://www.scopus.com/inward/record.url?scp=28344451638&partnerID=8YFLogxK
UR - http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=10204
UR - http://www.robio.org/
U2 - 10.1109/ROBIO.2004.1521910
DO - 10.1109/ROBIO.2004.1521910
M3 - Conference contribution book
SN - 0780386148
SP - 936
EP - 940
BT - IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004
PB - IEEE
CY - Piscataway, NJ.
ER -

Gu D, Yang E. A policy gradient reinforcement learning algorithm with fuzzy function approximation. In IEEE International Conference on Robotics and Biomimetics, 2004. ROBIO 2004. Piscataway, NJ.: IEEE. 2004. p. 936-940 https://doi.org/10.1109/ROBIO.2004.1521910