Investigation of metabolomics techniques by analysis of MS propolis data: which pre-treatment method is better?

Research output: Contribution to journal (Article)

Abstract

Metabolomics data usually undergoes both pre-processing of the raw data and then further pre-treatment before any statistical analysis is carried out. Different pre-treatment methods emphasise various aspects of the data, and each method has advantages and disadvantages. The choice of pre-treatment method depends on the biological question of interest, characteristics of the data and the chosen data analysis. In this paper, we investigate the effects of different pre-treatment methods on four metabolomics data sets arising from chemical analysis of propolis samples collected from honey bee colonies in three different locations in Scotland, and also samples from Libya. Propolis has a variety of biological properties including anti-protozoal and anti-inflammatory effects. As a complex mixture, its biological activity depends on its exact composition, which can be investigated via metabolomic analysis. Two techniques of pre-treatment were applied, namely, transformation and scaling. The choice of method was found to greatly affect the results of the principal component analysis (PCA) used to explain the variation in the data. The results indicated that there was no notable (if any) improvement to be made by using any transformation techniques. It was also found for all four data sets that Pareto scaling, incorporating mean centring, performed better than the other scaling approaches considered here in terms of PCA, the analysis of interest, because the results explain more of the variation in the data.
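The scaling methods named in the abstract and keywords can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes the commonly used definitions of each scaling (every one applied after mean centring), uses synthetic log-normal data in place of the propolis data sets, and computes PCA through a singular value decomposition. The function names `pretreat` and `explained_variance_ratio` are hypothetical.

```python
import numpy as np


def pretreat(X, method="pareto"):
    """Mean-centre each column (variable) of X, then apply the chosen scaling.

    The formulas are the commonly used definitions and are an assumption here;
    the paper may define some methods slightly differently.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)  # sample standard deviation per variable
    centred = X - mean
    if method == "centre":
        return centred
    if method == "auto":  # standardisation (unit variance)
        return centred / sd
    if method == "pareto":  # divide by the square root of the standard deviation
        return centred / np.sqrt(sd)
    if method == "range":  # divide by the observed range
        return centred / (X.max(axis=0) - X.min(axis=0))
    if method == "vast":  # autoscaling weighted by mean / sd
        return (centred / sd) * (mean / sd)
    if method == "level":  # divide by the mean
        return centred / mean
    raise ValueError(f"unknown method: {method}")


def explained_variance_ratio(X_centred):
    """Proportion of total variance explained by each principal component.

    Expects column-centred data, so the squared singular values are
    proportional to the PCA eigenvalues.
    """
    singular_values = np.linalg.svd(X_centred, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Synthetic stand-in data: 20 samples, 5 variables with very different magnitudes.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=[1, 2, 3, 4, 5], sigma=0.5, size=(20, 5))

for method in ["centre", "auto", "pareto", "range", "vast", "level"]:
    ratio = explained_variance_ratio(pretreat(X, method))
    # Cumulative proportion of variance captured by the first two components.
    print(f"{method:>7}: PC1+PC2 = {ratio[:2].sum():.3f}")
```

Comparing the printed proportions across methods mirrors the kind of comparison the abstract describes, although the numbers from synthetic data say nothing about the propolis results themselves.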
Language: English
Pages: 13-34
Number of pages: 22
Journal: Advances and Applications in Statistics
Volume: 58
Issue number: 1
DOI: 10.17654/AS058010013
Publication status: Published - 30 Sep 2019

Fingerprint

Metabolomics
Scaling
Principal Component Analysis
Chemical Analysis
Pareto Set
Statistical Analysis
Preprocessing
Data analysis

Keywords

  • metabolomics data
  • propolis
  • pre-treatment
  • principal component analysis (PCA)
  • transformation
  • centring
  • standardisation
  • vast scaling
  • Pareto scaling
  • range scaling
  • level scaling

Cite this

@article{e528714535a34b8189e7a226763bb8e0,
title = "Investigation of metabolomics techniques by analysis of MS propolis data: which pre-treatment method is better?",
abstract = "Metabolomics data usually undergoes both pre-processing of the raw data and then further pre-treatment before any statistical analysis is carried out. Different pre-treatment methods emphasise various aspects of the data, and each method has advantages and disadvantages. The choice of pre-treatment method depends on the biological question of interest, characteristics of the data and the chosen data analysis. In this paper, we investigate the effects of different pre-treatment methods on four metabolomics data sets arising from chemical analysis of propolis samples collected from honey bee colonies in three different locations in Scotland, and also samples from Libya. Propolis has a variety of biological properties including anti-protozoal and anti-inflammatory effects. As a complex mixture, its biological activity depends on its exact composition, which can be investigated via metabolomic analysis. Two techniques of pre-treatment were applied, namely, transformation and scaling. The choice of method was found to greatly affect the results of the principal component analysis (PCA) used to explain the variation in the data. The results indicated that there was no notable (if any) improvement to be made by using any transformation techniques. It was also found for all four data sets that Pareto scaling, incorporating mean centring, performed better than the other scaling approaches considered here in terms of PCA, the analysis of interest, because the results explain more of the variation in the data.",
keywords = "metabolomics data, propolis, pre-treatment, principal component analysis (PCA), transformation, centring, standardisation, vast scaling, Pareto scaling, range scaling, level scaling",
author = "Abdulaziz Alghamdi and Alison Gray and David Watson",
year = "2019",
month = "9",
day = "30",
doi = "10.17654/AS058010013",
language = "English",
volume = "58",
pages = "13--34",
journal = "Advances and Applications in Statistics",
issn = "0972-3617",
number = "1"
}

TY - JOUR
T1 - Investigation of metabolomics techniques by analysis of MS propolis data: which pre-treatment method is better?
T2 - Advances and Applications in Statistics
AU - Alghamdi, Abdulaziz
AU - Gray, Alison
AU - Watson, David
PY - 2019/9/30
Y1 - 2019/9/30
N2 - Metabolomics data usually undergoes both pre-processing of the raw data and then further pre-treatment before any statistical analysis is carried out. Different pre-treatment methods emphasise various aspects of the data, and each method has advantages and disadvantages. The choice of pre-treatment method depends on the biological question of interest, characteristics of the data and the chosen data analysis. In this paper, we investigate the effects of different pre-treatment methods on four metabolomics data sets arising from chemical analysis of propolis samples collected from honey bee colonies in three different locations in Scotland, and also samples from Libya. Propolis has a variety of biological properties including anti-protozoal and anti-inflammatory effects. As a complex mixture, its biological activity depends on its exact composition, which can be investigated via metabolomic analysis. Two techniques of pre-treatment were applied, namely, transformation and scaling. The choice of method was found to greatly affect the results of the principal component analysis (PCA) used to explain the variation in the data. The results indicated that there was no notable (if any) improvement to be made by using any transformation techniques. It was also found for all four data sets that Pareto scaling, incorporating mean centring, performed better than the other scaling approaches considered here in terms of PCA, the analysis of interest, because the results explain more of the variation in the data.

KW - metabolomics data
KW - propolis
KW - pre-treatment
KW - principal component analysis (PCA)
KW - transformation
KW - centring
KW - standardisation
KW - vast scaling
KW - Pareto scaling
KW - range scaling
KW - level scaling
U2 - 10.17654/AS058010013
DO - 10.17654/AS058010013
M3 - Article
VL - 58
SP - 13
EP - 34
JO - Advances and Applications in Statistics
JF - Advances and Applications in Statistics
SN - 0972-3617
IS - 1
ER -