Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement

Yijun Yan, Jinchang Ren, Genyun Sun, Huimin Zhao, Junwei Han, Xuelong Li, Stephen Marshall, Jin Zhan

Research output: Contribution to journal › Article

39 Citations (Scopus)

Abstract

Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, i.e. color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the model of the bottom-up mechanism is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure and ground in connection with color and spatial contrast at the level of regions and objects to produce a feature contrast map. The model of the top-down mechanism aims to use a formal computational model to describe the background connectivity of the attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets in terms of higher precision and recall rates, AP (average precision) and AUC (area under curve) values.
Language: English
Pages: 65-78
Number of pages: 14
Journal: Pattern Recognition
Volume: 79
Early online date: 5 Feb 2018
DOI: 10.1016/j.patcog.2018.02.004
Publication status: Published - 31 Jul 2018


Keywords

  • background connectivity
  • Gestalt laws guided optimization
  • image saliency detection
  • feature fusion
  • human vision perception

Cite this

@article{73653afbf5c440f394c061f55bde3caa,
title = "Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement",
abstract = "Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, i.e. color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the model of the bottom-up mechanism is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure and ground in connection with color and spatial contrast at the level of regions and objects to produce a feature contrast map. The model of the top-down mechanism aims to use a formal computational model to describe the background connectivity of the attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets in terms of higher precision and recall rates, AP (average precision) and AUC (area under curve) values.",
keywords = "background connectivity, Gestalt laws guided optimization, image saliency detection, feature fusion, human vision perception",
author = "Yijun Yan and Jinchang Ren and Genyun Sun and Huimin Zhao and Junwei Han and Xuelong Li and Stephen Marshall and Jin Zhan",
year = "2018",
month = "7",
day = "31",
doi = "10.1016/j.patcog.2018.02.004",
language = "English",
volume = "79",
pages = "65--78",
journal = "Pattern Recognition",
issn = "0031-3203",

}

Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement. / Yan, Yijun; Ren, Jinchang; Sun, Genyun; Zhao, Huimin; Han, Junwei; Li, Xuelong; Marshall, Stephen; Zhan, Jin.

In: Pattern Recognition, Vol. 79, 31.07.2018, p. 65-78.


TY - JOUR

T1 - Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement

AU - Yan, Yijun

AU - Ren, Jinchang

AU - Sun, Genyun

AU - Zhao, Huimin

AU - Han, Junwei

AU - Li, Xuelong

AU - Marshall, Stephen

AU - Zhan, Jin

PY - 2018/7/31

Y1 - 2018/7/31

N2 - Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, i.e. color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the model of the bottom-up mechanism is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure and ground in connection with color and spatial contrast at the level of regions and objects to produce a feature contrast map. The model of the top-down mechanism aims to use a formal computational model to describe the background connectivity of the attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets in terms of higher precision and recall rates, AP (average precision) and AUC (area under curve) values.

AB - Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct types of attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, i.e. color and location, whilst the top-down mechanism biases our attention based on prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose in this paper an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the model of the bottom-up mechanism is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure and ground in connection with color and spatial contrast at the level of regions and objects to produce a feature contrast map. The model of the top-down mechanism aims to use a formal computational model to describe the background connectivity of the attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets in terms of higher precision and recall rates, AP (average precision) and AUC (area under curve) values.

KW - background connectivity

KW - Gestalt laws guided optimization

KW - image saliency detection

KW - feature fusion

KW - human vision perception

UR - https://www.sciencedirect.com/journal/pattern-recognition

U2 - 10.1016/j.patcog.2018.02.004

DO - 10.1016/j.patcog.2018.02.004

M3 - Article

VL - 79

SP - 65

EP - 78

JO - Pattern Recognition

T2 - Pattern Recognition

JF - Pattern Recognition

SN - 0031-3203

ER -