We present an active reinforcement learning framework for unmanned aerial vehicle (UAV) first view landmark localization. We formulate the problem of landmark localization as that of a Markov decision process and introduce an active landmark-localization network (ALLNet) to address it. The aim of the ALLNet is to locate a bounding box that surrounds the landmark in a first view image sequence. To this end, it is trained in a reinforcement learning fashion. Specifically, it employs support vector machine (SVM) scores on the bounding box patches as rewards and learns the bounding box transformations as actions. Furthermore, each SVM score indicates whether or not the landmark is detected by the bounding box such that it enables the ALLNet to have the capability of judging whether the landmark leaves or re-enters a first view image. Therefore, the operation of the ALLNet is not only dominated by the reinforcement learning process but also supplemented by an active learning motivated manner. Once the landmark is considered to leave the first view image, the ALLNet stops operating until the SVM detects its re-entry to the view. The active reinforcement learning model enables training a robust ALLNet for landmark localization. The experimental results validate the effectiveness of the proposed model for UAV first view landmark localization.
- reinforcement learning
- first view landmark localization
- unmanned aerial vehicle