A novel update mechanism for Q-Networks based on extreme learning machines

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution book


Abstract

Reinforcement learning is a popular machine learning paradigm which can find near-optimal solutions to complex problems. Most often, these procedures involve function approximation using neural networks, with gradient-based updates to optimise the weights for the problem being considered. While this common approach generally works well, there are other update mechanisms which are largely unexplored in reinforcement learning. One such mechanism is the Extreme Learning Machine (ELM). ELMs were initially proposed to drastically improve the training speed of neural networks and have since seen many applications. Here we attempt to apply extreme learning machines to a reinforcement learning problem in the same manner as gradient-based updates. This new algorithm is called Extreme Q-Learning Machine (EQLM). We compare its performance to a typical Q-Network on the cart-pole task, a benchmark reinforcement learning problem, and show that EQLM has similar long-term learning performance to a Q-Network.
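The abstract does not spell out the EQLM update, so the following is only a minimal sketch of the general idea it describes: a Q-function approximator whose output weights are refit by an ELM-style regularised least-squares solve instead of a gradient step. It assumes NumPy, a single hidden layer with fixed random weights, and batch Q-learning targets; the class name `ELMQ`, the method names, the hidden-layer size, and the ridge value are all hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


class ELMQ:
    """Illustrative Q-function with an ELM-style update (not the paper's exact EQLM)."""

    def __init__(self, n_states, n_actions, n_hidden=32, ridge=1e-2, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden-layer weights are drawn once at random and never trained (ELM idea).
        self.W = rng.normal(size=(n_states, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # Only the output weights are learned.
        self.beta = np.zeros((n_hidden, n_actions))
        self.ridge = ridge  # assumed regularisation strength

    def hidden(self, states):
        """Hidden-layer activations for a batch of states."""
        return np.tanh(np.atleast_2d(states) @ self.W + self.b)

    def q_values(self, states):
        """Estimated Q-values, one row per state and one column per action."""
        return self.hidden(states) @ self.beta

    def fit_batch(self, states, actions, rewards, next_states, dones, gamma=0.99):
        """Refit the output weights to Q-learning targets by ridge least squares."""
        H = self.hidden(states)
        # Start from current predictions so untaken actions keep their estimates.
        targets = H @ self.beta
        bootstrap = self.q_values(next_states).max(axis=1)
        rows = np.arange(len(actions))
        targets[rows, actions] = rewards + gamma * bootstrap * (1.0 - dones)
        # Closed-form solve of (H^T H + ridge * I) beta = H^T targets.
        A = H.T @ H + self.ridge * np.eye(H.shape[1])
        self.beta = np.linalg.solve(A, H.T @ targets)
```

For a cart-pole-like problem with four state variables and two actions, one would construct `ELMQ(4, 2)` and call `fit_batch` on minibatches of stored transitions. A gradient-based Q-Network would take a small step towards the same targets; here the output weights are instead replaced by the least-squares solution for the batch.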
Original language: English
Title of host publication: 2020 International Joint Conference on Neural Networks (IJCNN)
Place of Publication: Piscataway, NJ
Publisher: IEEE
Number of pages: 7
ISBN (Electronic): 9781728169262
ISBN (Print): 9781728169279
Publication status: Published - 28 Sept 2020
Event: IEEE World Congress on Computational Intelligence 2020 - Glasgow, United Kingdom
Duration: 19 Jul 2020 to 24 Jul 2020
https://wcci2020.org/

Conference

Conference: IEEE World Congress on Computational Intelligence 2020
Abbreviated title: WCCI
Country/Territory: United Kingdom
City: Glasgow
Period: 19/07/20 to 24/07/20

Keywords

  • reinforcement learning
  • extreme learning machine (ELM)
  • neural network
