Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning

Junwei Han, Dingwen Zhang, Gong Cheng, Lei Guo, Jinchang Ren

Research output: Contribution to journal, Article, peer-reviewed

513 Citations (Scopus)
2212 Downloads (Pure)


The abundant spatial and contextual information provided by advanced remote sensing technology has facilitated the automatic interpretation of optical remote sensing images (RSIs). In this paper, a novel and effective geospatial object detection framework is proposed that combines weakly supervised learning (WSL) with high-level feature learning. First, a deep Boltzmann machine is adopted to infer the spatial and structural information encoded in the low-level and middle-level features, so that objects in optical RSIs can be described effectively. Then, a novel WSL approach to object detection is presented in which the training sets require only binary labels indicating whether or not an image contains the target object. Based on the learnt high-level features, the approach jointly integrates saliency, intraclass compactness, and interclass separability in a Bayesian framework to initialize a set of training examples from weakly labeled images and to start iterative learning of the object detector. A novel evaluation criterion is also developed to detect model drift and stop the iterative learning. Comprehensive experiments on three optical RSI data sets demonstrate the efficacy of the proposed approach when benchmarked against several state-of-the-art supervised-learning-based object detection approaches.
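The initialization and stopping steps described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: it assumes each cue is already a probability strictly between 0 and 1, fuses the cues under a naive conditional-independence assumption, and uses hypothetical names throughout (`fuse_cues`, `initial_positives`, `iterate_until_drift`, and the `retrain` and `quality` callbacks). The paper's actual Bayesian formulation and drift criterion are not given in the abstract.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical per-patch cues, each assumed to lie strictly in (0, 1):
# (saliency, intraclass compactness, interclass separability).
Cues = Tuple[float, float, float]


def fuse_cues(cues: Cues, prior: float = 0.5) -> float:
    """Posterior that a patch is the target, assuming independent cues."""
    pos = prior          # running evidence for "target"
    neg = 1.0 - prior    # running evidence for "not target"
    for c in cues:
        pos *= c
        neg *= 1.0 - c
    return pos / (pos + neg)


def initial_positives(patch_cues: Sequence[Cues], k: int) -> List[int]:
    """Indices of the k patches with the highest fused posterior."""
    ranked = sorted(
        range(len(patch_cues)),
        key=lambda i: fuse_cues(patch_cues[i]),
        reverse=True,
    )
    return ranked[:k]


def iterate_until_drift(
    positives: List[int],
    retrain: Callable[[List[int]], List[int]],
    quality: Callable[[List[int]], float],
    max_iters: int = 10,
) -> List[int]:
    """Refine the positive set until the quality score stops improving.

    `retrain` stands in for "train a detector and re-select positives";
    `quality` stands in for a drift-detection criterion. Both are
    placeholders supplied by the caller.
    """
    best, best_q = positives, quality(positives)
    for _ in range(max_iters):
        candidate = retrain(best)
        q = quality(candidate)
        if q <= best_q:  # no improvement: treat as drift and stop
            break
        best, best_q = candidate, q
    return best
```

In this sketch the loop keeps the last positive set whose score improved, so a drifting iteration is discarded rather than accepted.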
Original language: English
Pages (from-to): 3325-3337
Number of pages: 13
Journal: IEEE Transactions on Geoscience and Remote Sensing
Issue number: 6
Early online date: 18 Dec 2014
Publication status: Published - Jun 2015


  • Bayesian framework
  • deep Boltzmann machine
  • object detection
  • weakly supervised learning

