Automated weighted outlier detection technique for multivariate data

Suresh N. Thennadil, Mark Dewar, Craig Herdsman, Alison Nordon, Edo Becker

Research output: Contribution to journal › Article

Abstract

In the chemical and petrochemical industries, spectroscopy-based online analysers are becoming common for process monitoring and control applications. A significant challenge in using these analysers in process monitoring and control loops is the large amount of personnel time required to calibrate and maintain the models. This work involves decisions such as whether an observation is an outlier, how many latent variables a model should have, which type of pre-processing to apply, and when a calibration model has to be updated. Since no single measure works well for all applications, supervision by the process data analyst is required, which invariably involves some level of subjectivity. In this paper, we focus on the detection of multivariate outliers in a calibration set. We propose a method that combines multiple outlier detection techniques to identify a set of outlying observations without operator input. Apart from the overall methodology, this work introduces several novelties. The system uses partial least squares (PLS) instead of principal component analysis (PCA), which is normally used for detecting multivariate outliers. A simple modification to the Mahalanobis distance is also proposed, which appears to be more sensitive to outliers than the conventional Mahalanobis distance. The methodology also introduces the concept of a desirability function to enable automatic decision making based on multiple statistical measures for outlier detection. The methodology is demonstrated using Raman spectroscopy data collected from an industrial distillation process.
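
As a rough illustration of the idea of combining several outlier measures through a desirability function, the sketch below scores each observation with two measures and merges the scores into one automatic decision. It is not the authors' method: this record does not specify their modified Mahalanobis distance, their PLS-based measures, or the shape of their desirability function. The conventional Mahalanobis distance, the Euclidean distance used as a second measure, the linear ramp, the median-based thresholds, the geometric mean, the `cutoff` value, and all function names are assumptions made for illustration only.

```python
import numpy as np


def mahalanobis_distances(X):
    """Conventional Mahalanobis distance of each row of X from the sample mean."""
    centered = X - X.mean(axis=0)
    # Pseudo-inverse guards against a near-singular covariance matrix.
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(d2, 0.0))


def desirability(values, low, high):
    """Linear ramp: 1 at or below `low`, 0 at or above `high` (assumed shape)."""
    return np.clip((high - values) / (high - low), 0.0, 1.0)


def flag_outliers(X, cutoff=0.2):
    """Flag rows whose combined desirability falls below `cutoff` (assumed value)."""
    md = mahalanobis_distances(X)
    ed = np.linalg.norm(X - X.mean(axis=0), axis=1)  # second, illustrative measure
    scores = []
    for measure in (md, ed):
        med = np.median(measure)
        # Assumed thresholds: fully desirable up to the median, undesirable at 3x.
        scores.append(desirability(measure, med, 3.0 * med))
    combined = np.sqrt(scores[0] * scores[1])  # geometric mean of the two scores
    return combined < cutoff, combined


# Synthetic data: 50 well-behaved observations plus one gross outlier (last row).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(50, 3)), [[10.0, 10.0, 10.0]]])
flags, combined = flag_outliers(X)
print(np.flatnonzero(flags))  # indices of flagged observations
```

Because the geometric mean is zero whenever any single score is zero, one strongly violated measure is enough to flag an observation in this sketch. A weighted combination would behave differently, and the choice here is not taken from the paper.
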
Original language: English
Pages (from-to): 40-49
Number of pages: 10
Journal: Control Engineering Practice
Volume: 70
Early online date: 18 Oct 2017
DOI: 10.1016/j.conengprac.2017.09.018
Publication status: Published - 31 Jan 2018

Fingerprint

Outlier Detection
Multivariate Data
Multivariate Outliers
Mahalanobis Distance
Process Monitoring
Calibration
Process Control
Outlier
Methodology
Desirability Function
Distillation
Model Calibration
Raman Spectroscopy
Partial Least Squares
Latent Variables
Petrochemicals
Principal Component Analysis

Keywords

  • multivariate outliers
  • Mahalanobis distance
  • outlier detection
  • desirability function
  • multivariate trimming

Cite this

Thennadil, Suresh N. ; Dewar, Mark ; Herdsman, Craig ; Nordon, Alison ; Becker, Edo. / Automated weighted outlier detection technique for multivariate data. In: Control Engineering Practice. 2018 ; Vol. 70. pp. 40-49.
@article{cb57d5f2674e47bebde987c52210aa25,
title = "Automated weighted outlier detection technique for multivariate data",
keywords = "multivariate outliers, Mahalanobis distance, outlier detection, desirability function, multivariate trimming",
author = "Thennadil, {Suresh N.} and Mark Dewar and Craig Herdsman and Alison Nordon and Edo Becker",
year = "2018",
month = "1",
day = "31",
doi = "10.1016/j.conengprac.2017.09.018",
language = "English",
volume = "70",
pages = "40--49",
journal = "Control Engineering Practice",
issn = "0967-0661",
}

Automated weighted outlier detection technique for multivariate data. / Thennadil, Suresh N.; Dewar, Mark; Herdsman, Craig; Nordon, Alison; Becker, Edo.

In: Control Engineering Practice, Vol. 70, 31.01.2018, p. 40-49.

TY - JOUR
T1 - Automated weighted outlier detection technique for multivariate data
AU - Thennadil, Suresh N.
AU - Dewar, Mark
AU - Herdsman, Craig
AU - Nordon, Alison
AU - Becker, Edo
PY - 2018/1/31
Y1 - 2018/1/31
KW - multivariate outliers
KW - Mahalanobis distance
KW - outlier detection
KW - desirability function
KW - multivariate trimming
UR - http://www.sciencedirect.com/science/journal/09670661?sdc=1
U2 - 10.1016/j.conengprac.2017.09.018
DO - 10.1016/j.conengprac.2017.09.018
M3 - Article
VL - 70
SP - 40
EP - 49
JO - Control Engineering Practice
JF - Control Engineering Practice
SN - 0967-0661
ER -