A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm

Roberto Kawakami Harrop Galvao, Mario Cesar Ugulino Araujo, Wallace Duarte Fragoso, Edvan Cirino Silva, Gledson Emidio Jose, Sofacles Figueredo Carreiro Soares, Henrique Mohallem Paiva

Research output: Contribution to journal (Article)

182 Citations (Scopus)

Abstract

The successive projections algorithm (SPA) is a variable selection technique designed to minimize collinearity problems in multiple linear regression (MLR). This paper proposes a modification to the basic SPA formulation aimed at further improving the parsimony of the resulting MLR model. For this purpose, an elimination procedure is incorporated into the algorithm in order to remove variables that do not effectively contribute towards the prediction ability of the model as indicated by an F-test. The utility of the proposed modification is illustrated in a simulation study, as well as in two application examples involving the analysis of diesel and corn samples by near-infrared (NIR) spectroscopy. The results demonstrate that the number of variables selected by SPA can be reduced without significantly compromising prediction performance. In addition, SPA is favourably compared with classic Stepwise Regression and full-spectrum PLS. A graphical user interface for SPA is available at www.ele.ita.br/~kawakami/spa/. © 2008 Elsevier B.V. All rights reserved.
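The projection idea behind SPA can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: the function name `spa_chain`, the column centring, and the rule of picking the column with the largest remaining norm after each projection. The F-test elimination step proposed in the paper is not shown, because the abstract does not specify how it is carried out.

```python
import numpy as np


def spa_chain(X, start, n_select):
    """Build one chain of column indices by successive projections.

    Illustrative sketch only (not the authors' implementation).
    X        : (n_samples, n_variables) matrix, e.g. spectra in rows
    start    : index of the first variable in the chain
    n_select : number of variables to put in the chain
    Assumes the starting column is not constant (non-zero after centring).
    """
    X = np.asarray(X, dtype=float)
    Xp = X - X.mean(axis=0)  # centre each column (assumed preprocessing)
    chain = [start]
    for _ in range(n_select - 1):
        ref = Xp[:, chain[-1]]
        # Project every column onto the subspace orthogonal to `ref`,
        # removing the part of each variable that `ref` already explains.
        Xp = Xp - np.outer(ref, ref @ Xp) / (ref @ ref)
        norms = np.linalg.norm(Xp, axis=0)
        norms[chain] = -1.0  # never reselect a variable already in the chain
        # The column with the largest remaining norm is the least collinear
        # with the variables chosen so far.
        chain.append(int(np.argmax(norms)))
    return chain


# Toy data: column 1 is exactly 2 * column 0, column 2 is nearly independent.
X_demo = np.array([
    [1.0, 2.0, 1.0],
    [2.0, 4.0, -1.0],
    [3.0, 6.0, 1.0],
    [4.0, 8.0, -1.0],
    [5.0, 10.0, 1.0],
    [6.0, 12.0, -1.0],
])
print(spa_chain(X_demo, start=0, n_select=2))  # column 1 is skipped as collinear
```

In this toy case the chain started at column 0 moves on to column 2, since column 1 carries no information beyond column 0. A full SPA run would, under the same assumptions, build such chains from several starting columns and compare the resulting MLR models on validation data.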

Language: English
Pages: 83-91
Number of pages: 9
Journal: Chemometrics and intelligent laboratory systems
Volume: 92
Issue number: 1
DOI: https://doi.org/10.1016/j.chemolab.2007.12.004
Publication status: Published - 15 May 2008

Fingerprint

Linear regression
Near infrared spectroscopy
Graphical user interfaces

Keywords

  • multiple linear regression
  • variable selection
  • successive projections algorithm
  • near-infrared spectrometry
  • diesel analysis
  • corn analysis
  • multivariate calibration
  • selection
  • prediction

Cite this

Harrop Galvao, R. K., Ugulino Araujo, M. C., Fragoso, W. D., Silva, E. C., Jose, G. E., Carreiro Soares, S. F., & Paiva, H. M. (2008). A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm. Chemometrics and intelligent laboratory systems, 92(1), 83-91. https://doi.org/10.1016/j.chemolab.2007.12.004
Harrop Galvao, Roberto Kawakami ; Ugulino Araujo, Mario Cesar ; Fragoso, Wallace Duarte ; Silva, Edvan Cirino ; Jose, Gledson Emidio ; Carreiro Soares, Sofacles Figueredo ; Paiva, Henrique Mohallem. / A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm. In: Chemometrics and intelligent laboratory systems. 2008 ; Vol. 92, No. 1. pp. 83-91.
@article{a55a67933aee4cbeb44977a8f6424a2e,
title = "A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm",
abstract = "The successive projections algorithm (SPA) is a variable selection technique designed to minimize collinearity problems in multiple linear regression (MLR). This paper proposes a modification to the basic SPA formulation aimed at further improving the parsimony of the resulting MLR model. For this purpose, an elimination procedure is incorporated into the algorithm in order to remove variables that do not effectively contribute towards the prediction ability of the model as indicated by an F-test. The utility of the proposed modification is illustrated in a simulation study, as well as in two application examples involving the analysis of diesel and corn samples by near-infrared (NIR) spectroscopy. The results demonstrate that the number of variables selected by SPA can be reduced without significantly compromising prediction performance. In addition, SPA is favourably compared with classic Stepwise Regression and full-spectrum PLS. A graphical user interface for SPA is available at www.ele.ita.br/{\textasciitilde}kawakami/spa/. (C) 2008 Elsevier B.V. All rights reserved.",
keywords = "multiple linear regression, variable selection, successive projections algorithm, near-infrared spectrometry, diesel analysis, corn analysis, multivariate calibration, selection, prediction",
author = "{Harrop Galvao}, {Roberto Kawakami} and {Ugulino Araujo}, {Mario Cesar} and Fragoso, {Wallace Duarte} and Silva, {Edvan Cirino} and Jose, {Gledson Emidio} and {Carreiro Soares}, {Sofacles Figueredo} and Paiva, {Henrique Mohallem}",
year = "2008",
month = "5",
day = "15",
doi = "10.1016/j.chemolab.2007.12.004",
language = "English",
volume = "92",
pages = "83--91",
journal = "Chemometrics and intelligent laboratory systems",
issn = "0169-7439",
number = "1",

}

Harrop Galvao, RK, Ugulino Araujo, MC, Fragoso, WD, Silva, EC, Jose, GE, Carreiro Soares, SF & Paiva, HM 2008, 'A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm' Chemometrics and intelligent laboratory systems, vol. 92, no. 1, pp. 83-91. https://doi.org/10.1016/j.chemolab.2007.12.004

A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm. / Harrop Galvao, Roberto Kawakami; Ugulino Araujo, Mario Cesar; Fragoso, Wallace Duarte; Silva, Edvan Cirino; Jose, Gledson Emidio; Carreiro Soares, Sofacles Figueredo; Paiva, Henrique Mohallem.

In: Chemometrics and intelligent laboratory systems, Vol. 92, No. 1, 15.05.2008, p. 83-91.

TY - JOUR

T1 - A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm

AU - Harrop Galvao, Roberto Kawakami

AU - Ugulino Araujo, Mario Cesar

AU - Fragoso, Wallace Duarte

AU - Silva, Edvan Cirino

AU - Jose, Gledson Emidio

AU - Carreiro Soares, Sofacles Figueredo

AU - Paiva, Henrique Mohallem

PY - 2008/5/15

Y1 - 2008/5/15

AB - The successive projections algorithm (SPA) is a variable selection technique designed to minimize collinearity problems in multiple linear regression (MLR). This paper proposes a modification to the basic SPA formulation aimed at further improving the parsimony of the resulting MLR model. For this purpose, an elimination procedure is incorporated into the algorithm in order to remove variables that do not effectively contribute towards the prediction ability of the model as indicated by an F-test. The utility of the proposed modification is illustrated in a simulation study, as well as in two application examples involving the analysis of diesel and corn samples by near-infrared (NIR) spectroscopy. The results demonstrate that the number of variables selected by SPA can be reduced without significantly compromising prediction performance. In addition, SPA is favourably compared with classic Stepwise Regression and full-spectrum PLS. A graphical user interface for SPA is available at www.ele.ita.br/~kawakami/spa/. © 2008 Elsevier B.V. All rights reserved.

KW - multiple linear regression

KW - variable selection

KW - successive projections algorithm

KW - near-infrared spectrometry

KW - diesel analysis

KW - corn analysis

KW - multivariate calibration

KW - selection

KW - prediction

U2 - 10.1016/j.chemolab.2007.12.004

DO - 10.1016/j.chemolab.2007.12.004

M3 - Article

VL - 92

SP - 83

EP - 91

JO - Chemometrics and intelligent laboratory systems

T2 - Chemometrics and intelligent laboratory systems

JF - Chemometrics and intelligent laboratory systems

SN - 0169-7439

IS - 1

ER -