A machine-learning approach to modeling picophytoplankton abundances in the South China Sea

Bingzhang Chen, Hongbin Liu, Wupeng Xiao, Lei Wang, Bangqin Huang

Research output: Contribution to journal › Article › peer-review


Picophytoplankton, the smallest phytoplankton (<3 µm), contribute significantly to primary production in the oligotrophic South China Sea. To improve our ability to predict picophytoplankton abundances in the South China Sea and to infer the underlying mechanisms, we compared four machine learning algorithms for estimating the horizontal and vertical distributions of picophytoplankton abundances. The inputs to the algorithms include spatiotemporal variables (longitude, latitude, sampling depth, and date) and environmental variables (sea surface temperature, chlorophyll, and light). The algorithms were fit to a dataset of 2442 samples collected from 2006 to 2012. We find that Boosted Regression Trees (BRT) give the best prediction performance, with R² ranging from 77% to 85% for Chl a concentration and the abundances of three picophytoplankton groups. The model outputs confirm that temperature and light play important roles in shaping picophytoplankton distribution. Prochlorococcus, Synechococcus, and picoeukaryotes show, in that order, a decreasing preference for oligotrophy. These insights are reflected in the vertical patterns of Chl a and picoeukaryotes, which form subsurface maximum layers in summer and spring, in contrast to those of Prochlorococcus and Synechococcus, which are most abundant at the surface. Our forecasts suggest that, under the “business-as-usual” scenario, total Chl a will decrease but Prochlorococcus abundances will increase significantly by the end of this century. Synechococcus abundances will also increase, but the trend is significant only in coastal waters. Our study advances the ability to predict picophytoplankton abundances in the South China Sea and suggests that BRT is a useful machine learning technique for modelling plankton distribution.
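
This record does not include the authors' code, so the following is only a minimal sketch of the general idea behind boosted regression trees: many shallow trees are fit in sequence, each to the residuals left by the previous ones, and added with a small learning rate. It uses depth-one trees (stumps), squared-error loss, and the Python standard library. The predictors (temperature, depth, light), the synthetic response, and the hyperparameters are all illustrative assumptions and are not the study's data or settings.

```python
import math
import random


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    n, n_features = len(X), len(X[0])
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(n_features):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Minimizing the split's squared error is equivalent to
            # maximizing this quantity.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / k, right_sum / (n - k))
    if best is None:
        raise ValueError("all features are constant; no split is possible")
    return best[1:]


def predict_stump(stump, row):
    feature, threshold, left_mean, right_mean = stump
    return left_mean if row[feature] <= threshold else right_mean


def fit_brt(X, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row)
                for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_brt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def make_synthetic_samples(n, rng):
    """Hypothetical data: NOT real observations from the South China Sea."""
    X, y = [], []
    for _ in range(n):
        temperature = rng.uniform(20.0, 30.0)   # degrees C (assumed range)
        depth = rng.uniform(0.0, 150.0)         # metres (assumed range)
        light = math.exp(-0.04 * depth)         # assumed attenuation profile
        log_abundance = 0.2 * temperature + 3.0 * light + rng.gauss(0.0, 0.2)
        X.append([temperature, depth, light])
        y.append(log_abundance)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_samples(300, rng)
X_test, y_test = make_synthetic_samples(200, rng)

model = fit_brt(X_train, y_train)
r2 = r_squared(y_test, [predict_brt(model, row) for row in X_test])
print(f"held-out R^2 on synthetic data: {r2:.2f}")
```

Stumps yield a purely additive model, which suits this synthetic response. Deeper trees would be needed to capture interactions between predictors, and in practice a maintained library implementation would be preferable to this hand-rolled version.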
Original language: English
Article number: 102456
Number of pages: 15
Journal: Progress in Oceanography
Early online date: 16 Oct 2020
Publication status: Published - 31 Dec 2020


Keywords

  • Prochlorococcus
  • chlorophyll a
  • random forest
  • Synechococcus
  • South China Sea
  • boosted regression tree
  • generalized additive models
  • neural network

