Enabling intelligent onboard guidance, navigation, and control using reinforcement learning on near-term flight hardware

Research output: Contribution to journal › Article › peer-review

Future space missions require technological advances to meet more stringent requirements. Next-generation guidance, navigation, and control systems must operate safely and autonomously in hazardous and uncertain environments. While these developments often focus on flight software, spacecraft hardware also imposes computational limits on onboard algorithms. Intelligent control methods combine theories from automatic control, artificial intelligence, and operations research to derive control systems capable of handling large uncertainties. While this can be beneficial for spacecraft control, such control systems often require substantial computational power. Recent improvements in single-board computers have produced physically lighter and less power-intensive processors that are suitable for spaceflight and purpose-built for machine learning. In this study, we implement a reinforcement-learning-based controller on NVIDIA Jetson Nano hardware and apply this controller to a simulated Mars powered descent problem. The proposed approach uses optimal trajectories and guidance laws under nominal environment conditions to initialise a reinforcement learning agent. This agent learns a control policy to cope with environmental uncertainties and updates its control policy online using a novel update mechanism called Extreme Q-Learning Machine. We show that this control system performs well on flight-suitable hardware, which demonstrates the potential for intelligent control onboard spacecraft.
Original language: English
Pages (from-to): 374-385
Number of pages: 12
Journal: Acta Astronautica
Early online date: 15 Jul 2022
Publication status: Published - 31 Oct 2022


  • intelligent control
  • reinforcement learning
  • edge artificial intelligence

