TY - JOUR
T1 - A robust learned feature-based visual odometry system for UAV pose estimation in challenging indoor environments
AU - Yu, Leijian
AU - Yang, Erfu
AU - Yang, Beiya
AU - Fei, Zixiang
AU - Niu, Cong
N1 - © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
PY - 2023/5/24
Y1 - 2023/5/24
N2 - Unmanned aerial vehicles (UAVs) are becoming popular nowadays due to their versatility and flexibility for indoor applications, such as the autonomous visual inspection of the inner surface of a pressure vessel. Nevertheless, robust and reliable position estimation is critical for completing these tasks. Visual odometry (VO) and visual simultaneous localization and mapping (VSLAM) allow the UAV to estimate its position in unknown environments. However, traditional feature-based VO/VSLAM systems struggle to deal with complex scenes such as low illumination and textureless environments. Replacing the traditional features with deep learning-based features provides the advantage of handling the challenging environment, but the efficiency is ignored. In this work, an efficient VO system based on a novel lightweight feature extraction network for UAV onboard platforms has been developed. The deformable convolution (DFConv) is utilized to improve the feature extraction capability. Owing to the limited onboard computing capability, the depthwise separable convolution (DWConv) is adopted to calculate the offsets for the DFConv and construct the backbone network to improve the feature extraction efficiency. Experiments on public datasets indicate that the efficiency of the VO system is improved by 30.03% while preserving the accuracy on embedded platforms with the feature points and descriptors detected by the proposed convolutional neural network (CNN). Moreover, the proposed VO system is verified through UAV flying tests in a real-world scenario. The results prove that the proposed VO system is able to handle the challenging environments where both the latest traditional and deep learning feature-based VO/VSLAM systems fail, and it is feasible for UAV self-localization and autonomous navigation in the confined, low illumination and textureless indoor environment.
AB - Unmanned aerial vehicles (UAVs) are becoming popular nowadays due to their versatility and flexibility for indoor applications, such as the autonomous visual inspection of the inner surface of a pressure vessel. Nevertheless, robust and reliable position estimation is critical for completing these tasks. Visual odometry (VO) and visual simultaneous localization and mapping (VSLAM) allow the UAV to estimate its position in unknown environments. However, traditional feature-based VO/VSLAM systems struggle to deal with complex scenes such as low illumination and textureless environments. Replacing the traditional features with deep learning-based features provides the advantage of handling the challenging environment, but the efficiency is ignored. In this work, an efficient VO system based on a novel lightweight feature extraction network for UAV onboard platforms has been developed. The deformable convolution (DFConv) is utilized to improve the feature extraction capability. Owing to the limited onboard computing capability, the depthwise separable convolution (DWConv) is adopted to calculate the offsets for the DFConv and construct the backbone network to improve the feature extraction efficiency. Experiments on public datasets indicate that the efficiency of the VO system is improved by 30.03% while preserving the accuracy on embedded platforms with the feature points and descriptors detected by the proposed convolutional neural network (CNN). Moreover, the proposed VO system is verified through UAV flying tests in a real-world scenario. The results prove that the proposed VO system is able to handle the challenging environments where both the latest traditional and deep learning feature-based VO/VSLAM systems fail, and it is feasible for UAV self-localization and autonomous navigation in the confined, low illumination and textureless indoor environment.
KW - unmanned aerial vehicles
KW - visual odometry
KW - deep learning-based features
KW - depthwise separable convolution
KW - improved deformable convolution
U2 - 10.1109/TIM.2023.3279458
DO - 10.1109/TIM.2023.3279458
M3 - Article
SN - 0018-9456
VL - 72
SP - 1
EP - 11
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 5015411
ER -