Building extraction from high-resolution aerial imagery using a generative adversarial network with spatial and channel attention mechanisms

Xuran Pan, Fan Yang, Lianru Gao, Zhengchao Chen, Bing Zhang, Hairui Fan, Jinchang Ren

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Segmentation of high-resolution remote sensing images is an important challenge with wide practical applications. Increasing spatial resolution provides fine details for image segmentation but also introduces segmentation ambiguities. In this paper, we propose a generative adversarial network with spatial and channel attention mechanisms (GAN-SCA) for the robust segmentation of buildings in remote sensing images. The segmentation network (generator) of the proposed framework combines the well-known U-Net semantic segmentation architecture with spatial and channel attention (SCA) mechanisms. SCA enables the segmentation network to selectively enhance the more useful features at specific positions and in specific channels, which yields results closer to the ground truth. The discriminator is an adversarial network with channel attention mechanisms that distinguishes the outputs of the generator from the ground truth maps. The segmentation network and the adversarial network are trained in an alternating fashion on the Inria aerial image labeling dataset and the Massachusetts buildings dataset. Experimental results show that the proposed GAN-SCA outperforms several state-of-the-art approaches: it achieves an overall accuracy of 96.61% and an intersection over union of 77.75% on the Inria aerial image labeling dataset, and an F1-measure of 96.36% on the Massachusetts buildings dataset.
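
The three scores quoted in the abstract (overall accuracy, intersection over union, and F1-measure) can be illustrated with a minimal sketch. It assumes the standard pixel-wise definitions for a binary building mask; the abstract does not say how the authors computed them (a relaxed F1 variant, for example, is sometimes used on the Massachusetts buildings dataset), and the function name `binary_segmentation_scores` and the toy masks are hypothetical.

```python
from typing import Dict, Sequence


def binary_segmentation_scores(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Score a flattened binary mask (1 = building, 0 = background).

    Assumption: standard pixel-wise definitions, not necessarily the
    exact evaluation protocol used in the paper.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")

    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1      # building predicted, building present
        elif p:
            fp += 1      # building predicted, background present
        elif t:
            fn += 1      # background predicted, building present
        else:
            tn += 1      # background predicted, background present

    union = tp + fp + fn
    return {
        "overall_accuracy": (tp + tn) / (tp + fp + fn + tn),
        # Convention chosen here: if neither mask contains a building
        # pixel, IoU and F1 are reported as 1.0.
        "iou": tp / union if union else 1.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
    }


# Toy example: 8 pixels with 2 true positives, 1 false positive,
# 1 false negative and 4 true negatives.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
scores = binary_segmentation_scores(pred, truth)
print(scores)
```

On the toy masks the overall accuracy is 6/8, the intersection over union is 2/4, and the F1-measure is 4/6. Overall accuracy counts the background pixels as correct, so it is typically the highest of the three when buildings cover a small part of the image.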

Language: English
Article number: 917
Number of pages: 18
Journal: Remote Sensing
Volume: 11
Issue number: 8
DOIs: 10.3390/rs11080966
Publication status: Published - 15 Apr 2019

Fingerprint

segmentation
imagery
Antennas
Labeling
Remote sensing
Discriminators
Image segmentation
Semantics
remote sensing
spatial resolution

Keywords

  • deep learning
  • generative adversarial network
  • high-resolution aerial images
  • Inria aerial image labeling dataset
  • Massachusetts buildings dataset
  • semantic segmentation

Cite this

Pan, Xuran ; Yang, Fan ; Gao, Lianru ; Chen, Zhengchao ; Zhang, Bing ; Fan, Hairui ; Ren, Jinchang. / Building extraction from high-resolution aerial imagery using a generative adversarial network with spatial and channel attention mechanisms. In: Remote Sensing. 2019 ; Vol. 11, No. 8.
@article{ce3d030daa514b38b37b6a0297c82e7e,
title = "Building extraction from high-resolution aerial imagery using a generative adversarial network with spatial and channel attention mechanisms",
abstract = "Segmentation of high-resolution remote sensing images is an important challenge with wide practical applications. Increasing spatial resolution provides fine details for image segmentation but also introduces segmentation ambiguities. In this paper, we propose a generative adversarial network with spatial and channel attention mechanisms (GAN-SCA) for the robust segmentation of buildings in remote sensing images. The segmentation network (generator) of the proposed framework combines the well-known U-Net semantic segmentation architecture with spatial and channel attention (SCA) mechanisms. SCA enables the segmentation network to selectively enhance the more useful features at specific positions and in specific channels, which yields results closer to the ground truth. The discriminator is an adversarial network with channel attention mechanisms that distinguishes the outputs of the generator from the ground truth maps. The segmentation network and the adversarial network are trained in an alternating fashion on the Inria aerial image labeling dataset and the Massachusetts buildings dataset. Experimental results show that the proposed GAN-SCA outperforms several state-of-the-art approaches: it achieves an overall accuracy of 96.61{\%} and an intersection over union of 77.75{\%} on the Inria aerial image labeling dataset, and an F1-measure of 96.36{\%} on the Massachusetts buildings dataset.",
keywords = "deep learning, generative adversarial network, high-resolution aerial images, Inria aerial image labeling dataset, Massachusetts buildings dataset, semantic segmentation",
author = "Xuran Pan and Fan Yang and Lianru Gao and Zhengchao Chen and Bing Zhang and Hairui Fan and Jinchang Ren",
year = "2019",
month = "4",
day = "15",
doi = "10.3390/rs11080966",
language = "English",
volume = "11",
journal = "Remote Sensing",
issn = "2072-4292",
number = "8",

}

TY - JOUR

T1 - Building extraction from high-resolution aerial imagery using a generative adversarial network with spatial and channel attention mechanisms

AU - Pan, Xuran

AU - Yang, Fan

AU - Gao, Lianru

AU - Chen, Zhengchao

AU - Zhang, Bing

AU - Fan, Hairui

AU - Ren, Jinchang

PY - 2019/4/15

Y1 - 2019/4/15

AB - Segmentation of high-resolution remote sensing images is an important challenge with wide practical applications. Increasing spatial resolution provides fine details for image segmentation but also introduces segmentation ambiguities. In this paper, we propose a generative adversarial network with spatial and channel attention mechanisms (GAN-SCA) for the robust segmentation of buildings in remote sensing images. The segmentation network (generator) of the proposed framework combines the well-known U-Net semantic segmentation architecture with spatial and channel attention (SCA) mechanisms. SCA enables the segmentation network to selectively enhance the more useful features at specific positions and in specific channels, which yields results closer to the ground truth. The discriminator is an adversarial network with channel attention mechanisms that distinguishes the outputs of the generator from the ground truth maps. The segmentation network and the adversarial network are trained in an alternating fashion on the Inria aerial image labeling dataset and the Massachusetts buildings dataset. Experimental results show that the proposed GAN-SCA outperforms several state-of-the-art approaches: it achieves an overall accuracy of 96.61% and an intersection over union of 77.75% on the Inria aerial image labeling dataset, and an F1-measure of 96.36% on the Massachusetts buildings dataset.

KW - deep learning

KW - generative adversarial network

KW - high-resolution aerial images

KW - Inria aerial image labeling dataset

KW - Massachusetts buildings dataset

KW - semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85065020057&partnerID=8YFLogxK

U2 - 10.3390/rs11080966

DO - 10.3390/rs11080966

M3 - Article

VL - 11

JO - Remote Sensing

T2 - Remote Sensing

JF - Remote Sensing

SN - 2072-4292

IS - 8

M1 - 917

ER -