High-resolution aerial imagery semantic labeling with dense pyramid network

Xuran Pan, Lianru Gao, Bing Zhang, Fan Yang, Wenzhi Liao

Research output: Contribution to journal, Article

5 Citations (Scopus)

Abstract

Semantic segmentation of high-resolution aerial images is of great importance in certain fields, but increasing spatial resolution brings large intra-class variance and small inter-class differences that can lead to classification ambiguities. Drawing on high-level contextual features, the deep convolutional neural network (DCNN) is an effective method for semantic segmentation of high-resolution aerial imagery. In this work, a novel dense pyramid network (DPN) is proposed for semantic segmentation. The network starts with group convolutions that process multi-sensor data channel-wise, extracting the feature maps of each channel separately so that more information from each channel is preserved. A channel shuffle operation follows to enhance the representation ability of the network. Then, four densely connected convolutional blocks are used to both extract features and take full advantage of them. A pyramid pooling module combined with two convolutional layers fuses the multi-resolution and multi-sensor features through an effective global scene prior, producing a probability map for each class. Moreover, a median frequency balanced focal loss is proposed to replace the standard cross-entropy loss in the training phase and deal with the class imbalance problem. We evaluate the dense pyramid network on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam 2D semantic labeling datasets, and the results demonstrate that the proposed framework performs better than the state-of-the-art baseline.
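The abstract names a median frequency balanced focal loss but does not give its formula. The sketch below is one plausible reading, not the authors' implementation: it assumes the usual median frequency balancing (class weight = median class frequency / class frequency) combined with the usual focal term, i.e. a per-pixel loss of -w_c (1 - p_t)^gamma log(p_t). The function names, the flattened-list data layout, and the default gamma = 2.0 are all assumptions made for illustration.

```python
import math
from statistics import median


def median_frequency_weights(label_maps, num_classes):
    """Assumed weighting: w_c = median(freq) / freq_c.

    freq_c = (pixels labelled c) / (total pixels of the label maps that
    contain c). Classes that never occur get weight 0.0.
    Each label map is a flat list of integer class ids.
    """
    pixel_count = [0] * num_classes   # pixels labelled c
    image_pixels = [0] * num_classes  # pixels of all maps in which c appears
    for labels in label_maps:
        for c in labels:
            pixel_count[c] += 1
        for c in set(labels):
            image_pixels[c] += len(labels)
    freq = [
        pixel_count[c] / image_pixels[c] if image_pixels[c] else 0.0
        for c in range(num_classes)
    ]
    med = median(f for f in freq if f > 0)
    return [med / f if f > 0 else 0.0 for f in freq]


def balanced_focal_loss(probs, labels, weights, gamma=2.0, eps=1e-12):
    """Mean over pixels of -w_c * (1 - p_t)**gamma * log(p_t).

    probs:  one list of class probabilities per pixel (softmax output)
    labels: true class id per pixel
    gamma:  focusing parameter (assumed default); gamma = 0 with unit
            weights reduces to the standard cross-entropy loss
    """
    total = 0.0
    for p, c in zip(probs, labels):
        p_t = min(max(p[c], eps), 1.0)  # clamp to avoid log(0)
        total += -weights[c] * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(labels)


# Toy usage: two 4-pixel label maps and three classes.
weights = median_frequency_weights([[0, 0, 0, 1], [0, 0, 2, 2]], 3)
# weights are approximately [0.8, 2.0, 1.0]: the rare class 1 is up-weighted.
loss = balanced_focal_loss(
    [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]], [0, 1], weights
)
```

Under these assumptions, rare classes receive larger weights and well-classified pixels (p_t close to 1) contribute little, which is how such a loss would address the class imbalance problem the abstract mentions.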
Language: English
Article number: 3774
Pages: 1-15
Number of pages: 15
Journal: Sensors
Volume: 18
Issue number: 11
DOI: 10.3390/s18113774
Publication status: Published - 5 Nov 2018

Keywords

  • high-resolution aerial imageries
  • semantic segmentation
  • densely connected convolutions
  • pyramid pooling module

Cite this

Pan, Xuran; Gao, Lianru; Zhang, Bing; Yang, Fan; Liao, Wenzhi. / High-resolution aerial imagery semantic labeling with dense pyramid network. In: Sensors. 2018; Vol. 18, No. 11. pp. 1-15.
@article{b2f631527ff84f9f8a8ed510f9eaf415,
title = "High-resolution aerial imagery semantic labeling with dense pyramid network",
abstract = "Semantic segmentation of high-resolution aerial images is of great importance in certain fields, but increasing spatial resolution brings large intra-class variance and small inter-class differences that can lead to classification ambiguities. Drawing on high-level contextual features, the deep convolutional neural network (DCNN) is an effective method for semantic segmentation of high-resolution aerial imagery. In this work, a novel dense pyramid network (DPN) is proposed for semantic segmentation. The network starts with group convolutions that process multi-sensor data channel-wise, extracting the feature maps of each channel separately so that more information from each channel is preserved. A channel shuffle operation follows to enhance the representation ability of the network. Then, four densely connected convolutional blocks are used to both extract features and take full advantage of them. A pyramid pooling module combined with two convolutional layers fuses the multi-resolution and multi-sensor features through an effective global scene prior, producing a probability map for each class. Moreover, a median frequency balanced focal loss is proposed to replace the standard cross-entropy loss in the training phase and deal with the class imbalance problem. We evaluate the dense pyramid network on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam 2D semantic labeling datasets, and the results demonstrate that the proposed framework performs better than the state-of-the-art baseline.",
keywords = "high-resolution aerial imageries, semantic segmentation, densely connected convolutions, pyramid pooling module",
author = "Xuran Pan and Lianru Gao and Bing Zhang and Fan Yang and Wenzhi Liao",
year = "2018",
month = "11",
day = "5",
doi = "10.3390/s18113774",
language = "English",
volume = "18",
pages = "1--15",
journal = "Sensors",
issn = "1424-8220",
number = "11",

}

High-resolution aerial imagery semantic labeling with dense pyramid network. / Pan, Xuran; Gao, Lianru; Zhang, Bing; Yang, Fan; Liao, Wenzhi.

In: Sensors, Vol. 18, No. 11, 3774, 05.11.2018, p. 1-15.

TY - JOUR
T1 - High-resolution aerial imagery semantic labeling with dense pyramid network
AU - Pan, Xuran
AU - Gao, Lianru
AU - Zhang, Bing
AU - Yang, Fan
AU - Liao, Wenzhi
PY - 2018/11/5
Y1 - 2018/11/5
AB - Semantic segmentation of high-resolution aerial images is of great importance in certain fields, but increasing spatial resolution brings large intra-class variance and small inter-class differences that can lead to classification ambiguities. Drawing on high-level contextual features, the deep convolutional neural network (DCNN) is an effective method for semantic segmentation of high-resolution aerial imagery. In this work, a novel dense pyramid network (DPN) is proposed for semantic segmentation. The network starts with group convolutions that process multi-sensor data channel-wise, extracting the feature maps of each channel separately so that more information from each channel is preserved. A channel shuffle operation follows to enhance the representation ability of the network. Then, four densely connected convolutional blocks are used to both extract features and take full advantage of them. A pyramid pooling module combined with two convolutional layers fuses the multi-resolution and multi-sensor features through an effective global scene prior, producing a probability map for each class. Moreover, a median frequency balanced focal loss is proposed to replace the standard cross-entropy loss in the training phase and deal with the class imbalance problem. We evaluate the dense pyramid network on the International Society for Photogrammetry and Remote Sensing (ISPRS) Vaihingen and Potsdam 2D semantic labeling datasets, and the results demonstrate that the proposed framework performs better than the state-of-the-art baseline.
KW - high-resolution aerial imageries
KW - semantic segmentation
KW - densely connected convolutions
KW - pyramid pooling module
U2 - 10.3390/s18113774
DO - 10.3390/s18113774
M3 - Article
VL - 18
SP - 1
EP - 15
JO - Sensors
T2 - Sensors
JF - Sensors
SN - 1424-8220
IS - 11
M1 - 3774
ER -