Investigation of metabolomics techniques by analysis of MS propolis data: which pre-treatment method is better?

Research output: Contribution to journal, Article

Abstract

Metabolomics data usually undergo pre-processing of the raw data, followed by further pre-treatment, before any statistical analysis is carried out. Different pre-treatment methods emphasise different aspects of the data, and each method has advantages and disadvantages. The choice of pre-treatment method depends on the biological question of interest, the characteristics of the data and the chosen method of data analysis. In this paper, we investigate the effects of different pre-treatment methods on four metabolomics data sets arising from chemical analysis of propolis samples collected from honey bee colonies in three different locations in Scotland, and also samples from Libya. Propolis has a variety of biological properties, including anti-protozoal and anti-inflammatory effects. As a complex mixture, its biological activity depends on its exact composition, which can be investigated via metabolomic analysis. Two types of pre-treatment were applied, namely transformation and scaling. The choice of method was found to greatly affect the results of the principal component analysis (PCA) used to explain the variation in the data. The results indicated that the transformation techniques offered little, if any, improvement. For all four data sets, Pareto scaling, which incorporates mean centring, performed better than the other scaling approaches considered here for PCA, the analysis of interest, because the resulting components explain more of the variation in the data.
Original language: English
Pages (from-to): 13-34
Number of pages: 22
Journal: Advances and Applications in Statistics
Volume: 58
Issue number: 1
Publication status: Published - 30 Sep 2019

Keywords

  • metabolomics data
  • propolis
  • pre-treatment
  • principal component analysis (PCA)
  • transformation
  • centring
  • standardisation
  • vast scaling
  • Pareto scaling
  • range scaling
  • level scaling
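
The scaling methods named in the abstract and keywords can be illustrated with a minimal Python sketch. The formulas below are the forms commonly used in the metabolomics literature and are an assumption here, since this page does not give the paper's exact definitions. The function names `scale` and `explained_variance_ratio` are hypothetical, and the data are synthetic, not the propolis data sets.

```python
import numpy as np

def scale(X, method="pareto"):
    """Column-wise scaling of a samples-by-features matrix.

    Commonly used definitions (assumed, not taken from the paper);
    every method includes mean centring.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)        # sample standard deviation per feature
    centred = X - mean
    if method == "centre":
        return centred
    if method == "auto":              # standardisation (autoscaling)
        return centred / sd
    if method == "pareto":            # divide by square root of the sd
        return centred / np.sqrt(sd)
    if method == "range":             # divide by the observed range
        return centred / (X.max(axis=0) - X.min(axis=0))
    if method == "vast":              # autoscaling weighted by mean / sd
        return (centred / sd) * (mean / sd)
    if method == "level":             # divide by the feature mean
        return centred / mean
    raise ValueError(f"unknown method: {method}")

def explained_variance_ratio(Z):
    """Share of total variance carried by each principal component.

    Z must already be mean-centred; the squared singular values of Z
    are proportional to the variances of the principal components.
    """
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2
    return var / var.sum()

# Synthetic positive-valued data standing in for MS intensities.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=2.0, sigma=1.0, size=(20, 6))

for method in ("centre", "auto", "pareto", "range", "vast", "level"):
    ratio = explained_variance_ratio(scale(X, method))
    print(f"{method:>7}: first two PCs explain {ratio[:2].sum():.1%}")
```

Comparing the share of variance explained by the first few components across methods mirrors the kind of comparison the abstract describes, although the figures from synthetic data say nothing about the paper's results.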
