Bayesian modelling and quantification of Raman spectroscopy

Matthew Moores, Kirsten Gracie, Jake Carson, Karen Faulds, Duncan Graham, Mark Girolami

Research output: Contribution to journal › Article › peer-review



Raman spectroscopy can be used to identify molecules such as DNA by the characteristic scattering of light from a laser. It is sensitive at very low concentrations and can accurately quantify the amount of a given molecule in a sample. The presence of a large, nonuniform background presents a major challenge to analysis of these spectra. To overcome this challenge, we introduce a sequential Monte Carlo (SMC) algorithm to separate each observed spectrum into a series of peaks plus a smoothly-varying baseline, corrupted by additive white noise. The peaks are modelled as Lorentzian, Gaussian, or pseudo-Voigt functions, while the baseline is estimated using a penalised cubic spline. This latent continuous representation accounts for differences in resolution between measurements. The posterior distribution can be incrementally updated as more data becomes available, resulting in a scalable algorithm that is robust to local maxima. By incorporating this representation in a Bayesian hierarchical regression model, we can quantify the relationship between molecular concentration and peak intensity, thereby providing an improved estimate of the limit of detection, which is of major importance to analytical chemistry.
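The signal model described in the abstract (peaks plus a smooth baseline plus additive white noise) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the FWHM-based parameterisation, and the quadratic `toy_baseline` (a simple stand-in for the penalised cubic spline) are all assumptions made for illustration, and no SMC inference is performed here.

```python
import math
import random


def lorentzian(x, loc, amp, fwhm):
    """Lorentzian peak with height `amp` at `loc` and full width at half maximum `fwhm`."""
    hwhm = fwhm / 2.0
    return amp * hwhm ** 2 / ((x - loc) ** 2 + hwhm ** 2)


def gaussian(x, loc, amp, fwhm):
    """Gaussian peak with height `amp` at `loc` and full width at half maximum `fwhm`."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return amp * math.exp(-((x - loc) ** 2) / (2.0 * sigma ** 2))


def pseudo_voigt(x, loc, amp, fwhm, eta):
    """Pseudo-Voigt peak: a convex mixture of a Lorentzian (weight eta) and a Gaussian."""
    return eta * lorentzian(x, loc, amp, fwhm) + (1.0 - eta) * gaussian(x, loc, amp, fwhm)


def toy_baseline(x):
    """Hypothetical smooth baseline (a quadratic), standing in for a penalised cubic spline."""
    return 50.0 + 0.02 * x - 5e-6 * x ** 2


def simulate_spectrum(wavenumbers, peaks, noise_sd, seed=0):
    """Simulate observations as peaks + baseline + independent Gaussian (white) noise."""
    rng = random.Random(seed)
    spectrum = []
    for x in wavenumbers:
        signal = sum(pseudo_voigt(x, **p) for p in peaks)
        spectrum.append(signal + toy_baseline(x) + rng.gauss(0.0, noise_sd))
    return spectrum


# Example: two illustrative peaks on a 1 cm^-1 grid (all values are made up).
grid = [float(w) for w in range(400, 1801)]
example_peaks = [
    dict(loc=1000.0, amp=200.0, fwhm=20.0, eta=0.5),
    dict(loc=1350.0, amp=120.0, fwhm=30.0, eta=0.2),
]
observed = simulate_spectrum(grid, example_peaks, noise_sd=5.0)
```

Because each peak is a continuous function of wavenumber, the same peak parameters can be evaluated on any measurement grid, which is one way to read the abstract's remark that the latent continuous representation accounts for differences in resolution between measurements.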
Original language: English
Journal: Annals of Applied Statistics
Publication status: Accepted/In press - 24 Jan 2018


  • chemometrics
  • functional data analysis
  • multivariate calibration
  • nanotechnology
  • sequential Monte Carlo
