A method for calibration and validation subset partitioning

R K H Galvao, M C U Araujo, G E Jose, M J C Pontes, E C Silva, T C B Saldanha

Research output: Contribution to journal › Article

430 Citations (Scopus)

Abstract

This paper proposes a new method to divide a pool of samples into calibration and validation subsets for multivariate modelling. The proposed method is of value for analytical applications involving complex matrices, in which the composition variability of real samples cannot be easily reproduced by optimized experimental designs. A stepwise procedure is employed to select samples according to their differences in both x (instrumental responses) and y (predicted parameter) spaces. The proposed technique is illustrated in a case study involving the prediction of three quality parameters (specific mass and the distillation temperatures at which 10% and 90% of the sample has evaporated) of diesel by NIR spectrometry and PLS modelling. For comparison, PLS models are also constructed by full cross-validation, as well as by using the Kennard-Stone and random sampling methods for calibration and validation subset partitioning. The obtained models are compared in terms of prediction performance by employing an independent set of samples not used for calibration or validation. The results of F-tests at the 95% confidence level reveal that the proposed technique may be an advantageous alternative to the other three strategies. © 2005 Elsevier B.V. All rights reserved.
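The stepwise selection described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes a Kennard-Stone style max-min selection applied to a joint distance formed by adding the x-space and y-space Euclidean distances, each divided by its maximum over the pool. The abstract does not state that normalization, and the names `pairwise` and `joint_xy_partition` are hypothetical.

```python
import math


def pairwise(points):
    """Euclidean distance matrix for a list of equal-length numeric vectors."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d


def joint_xy_partition(X, y, n_cal):
    """Split sample indices into (calibration, validation) lists.

    X: list of instrumental-response vectors, y: list of scalar reference
    values, n_cal: number of calibration samples to select.
    """
    n = len(X)
    if len(y) != n:
        raise ValueError("X and y must have the same number of samples")
    if not 2 <= n_cal <= n:
        raise ValueError("n_cal must be between 2 and the number of samples")

    dx = pairwise(X)
    dy = pairwise([[v] for v in y])
    # Assumption: each distance is scaled by its maximum so that x and y
    # contribute on a comparable footing (guard against all-zero distances).
    max_dx = max(map(max, dx)) or 1.0
    max_dy = max(map(max, dy)) or 1.0
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]

    # Start with the pair of samples that are farthest apart.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: d[p[0]][p[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in selected]

    # Stepwise: add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_cal:
        nxt = max(remaining, key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy pool of four one-variable "spectra" with their reference values.
cal, val = joint_xy_partition([[0.0], [1.0], [2.0], [10.0]],
                              [0.0, 0.1, 0.2, 5.0], n_cal=3)
# cal == [0, 3, 2], val == [1]
```

Setting the y-space term to zero would reduce this sketch to a plain Kennard-Stone selection on x alone, which is one of the comparison methods named in the abstract.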

Language: English
Pages: 736-740
Number of pages: 5
Journal: Talanta
Volume: 67
Issue number: 4
DOI: https://doi.org/10.1016/j.talanta.2005.03.025
Publication status: Published - 15 Oct 2005

Fingerprint

Calibration
Set theory
Distillation
Design of experiments
Spectrometry
Sampling
Spectrum Analysis
Research Design
Chemical analysis
Temperature

Keywords

  • sample subset partitioning
  • PLS regression
  • Kennard-Stone algorithm
  • NIR spectrometry
  • diesel analysis
  • artificial neural network
  • multivariate calibration
  • DESIGN
  • FUEL

Cite this

Galvao, R. K. H., Araujo, M. C. U., Jose, G. E., Pontes, M. J. C., Silva, E. C., & Saldanha, T. C. B. (2005). A method for calibration and validation subset partitioning. Talanta, 67(4), 736-740. https://doi.org/10.1016/j.talanta.2005.03.025
Galvao, R K H; Araujo, M C U; Jose, G E; Pontes, M J C; Silva, E C; Saldanha, T C B. / A method for calibration and validation subset partitioning. In: Talanta. 2005; Vol. 67, No. 4. pp. 736-740.
@article{65177819f05045d29892914c68f661cc,
title = "A method for calibration and validation subset partitioning",
keywords = "sample subset partitioning, PLS regression, Kennard-Stone algorithm, NIR spectrometry, diesel analysis, artificial neural network, multivariate calibration, DESIGN, FUEL",
author = "Galvao, {R K H} and Araujo, {M C U} and Jose, {G E} and Pontes, {M J C} and Silva, {E C} and Saldanha, {T C B}",
year = "2005",
month = "10",
day = "15",
doi = "10.1016/j.talanta.2005.03.025",
language = "English",
volume = "67",
pages = "736--740",
journal = "Talanta",
issn = "0039-9140",
number = "4",
}

Galvao, RKH, Araujo, MCU, Jose, GE, Pontes, MJC, Silva, EC & Saldanha, TCB 2005, 'A method for calibration and validation subset partitioning', Talanta, vol. 67, no. 4, pp. 736-740. https://doi.org/10.1016/j.talanta.2005.03.025

TY  - JOUR
T1  - A method for calibration and validation subset partitioning
AU  - Galvao, R K H
AU  - Araujo, M C U
AU  - Jose, G E
AU  - Pontes, M J C
AU  - Silva, E C
AU  - Saldanha, T C B
PY  - 2005/10/15
Y1  - 2005/10/15
KW  - sample subset partitioning
KW  - PLS regression
KW  - Kennard-Stone algorithm
KW  - NIR spectrometry
KW  - diesel analysis
KW  - artificial neural network
KW  - multivariate calibration
KW  - DESIGN
KW  - FUEL
U2  - 10.1016/j.talanta.2005.03.025
DO  - 10.1016/j.talanta.2005.03.025
M3  - Article
VL  - 67
SP  - 736
EP  - 740
JO  - Talanta
T2  - Talanta
JF  - Talanta
SN  - 0039-9140
IS  - 4
ER  -

Galvao RKH, Araujo MCU, Jose GE, Pontes MJC, Silva EC, Saldanha TCB. A method for calibration and validation subset partitioning. Talanta. 2005 Oct 15;67(4):736-740. https://doi.org/10.1016/j.talanta.2005.03.025