TY - JOUR
T1 - A variable elimination method to improve the parsimony of MLR models using the successive projections algorithm
AU - Harrop Galvao, Roberto Kawakami
AU - Ugulino Araujo, Mario Cesar
AU - Fragoso, Wallace Duarte
AU - Silva, Edvan Cirino
AU - Jose, Gledson Emidio
AU - Carreiro Soares, Sofacles Figueredo
AU - Paiva, Henrique Mohallem
PY - 2008/5/15
Y1 - 2008/5/15
N2 - The successive projections algorithm (SPA) is a variable selection technique designed to minimize collinearity problems in multiple linear regression (MLR). This paper proposes a modification to the basic SPA formulation aimed at further improving the parsimony of the resulting MLR model. For this purpose, an elimination procedure is incorporated into the algorithm in order to remove variables that do not effectively contribute towards the prediction ability of the model as indicated by an F-test. The utility of the proposed modification is illustrated in a simulation study, as well as in two application examples involving the analysis of diesel and corn samples by near-infrared (NIR) spectroscopy. The results demonstrate that the number of variables selected by SPA can be reduced without significantly compromising prediction performance. In addition, SPA is favourably compared with classic Stepwise Regression and full-spectrum PLS. A graphical user interface for SPA is available at www.ele.ita.br/~kawakami/spa/. (C) 2008 Elsevier B.V. All rights reserved.
AB - The successive projections algorithm (SPA) is a variable selection technique designed to minimize collinearity problems in multiple linear regression (MLR). This paper proposes a modification to the basic SPA formulation aimed at further improving the parsimony of the resulting MLR model. For this purpose, an elimination procedure is incorporated into the algorithm in order to remove variables that do not effectively contribute towards the prediction ability of the model as indicated by an F-test. The utility of the proposed modification is illustrated in a simulation study, as well as in two application examples involving the analysis of diesel and corn samples by near-infrared (NIR) spectroscopy. The results demonstrate that the number of variables selected by SPA can be reduced without significantly compromising prediction performance. In addition, SPA is favourably compared with classic Stepwise Regression and full-spectrum PLS. A graphical user interface for SPA is available at www.ele.ita.br/~kawakami/spa/. (C) 2008 Elsevier B.V. All rights reserved.
KW - multiple linear regression
KW - variable selection
KW - successive projections algorithm
KW - near-infrared spectrometry
KW - diesel analysis
KW - corn analysis
KW - multivariate calibration
KW - selection
KW - prediction
U2 - 10.1016/j.chemolab.2007.12.004
DO - 10.1016/j.chemolab.2007.12.004
M3 - Article
SN - 0169-7439
VL - 92
SP - 83
EP - 91
JO - Chemometrics and Intelligent Laboratory Systems
JF - Chemometrics and Intelligent Laboratory Systems
IS - 1
ER -