Automated classification of schools of the Silver Cyprinid Rastrineobola argentea in Lake Victoria acoustic survey data using random forests

Roland Proud, Richard Mangeni-Sande, Robert J. Kayanda, Martin J. Cox, Chrisphine Nyamweya, Collins Ongore, Vianny Natugonza, Inigo Everson, Mboni Elison, Laura Hobbs, Benedicto Boniphace Kashindye, Enock W. Mlaponi, Anthony Taabu-Munyaho, Venny M. Mwainge, Esther Kagoya, Antonio Pegado, Evarist Nduwayesu, Andrew S. Brierley

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)
7 Downloads (Pure)


Biomass of the schooling fish Rastrineobola argentea (dagaa) is presently estimated in Lake Victoria by acoustic survey following the simple “rule” that dagaa is the source of most echo energy returned from the top third of the water column. Dagaa have, however, been caught in the bottom two-thirds, and other species occur towards the surface: a more robust discrimination technique is required. We explored the utility of a school-based random forest (RF) classifier applied to 120 kHz data from a lake-wide survey. Dagaa schools were first identified manually using expert opinion informed by fishing. These schools contained a lake-wide biomass of 0.68 million tonnes (MT). Only 43.4% of identified dagaa schools occurred in the top third of the water column, and 37.3% of all schools in the bottom two-thirds were classified as dagaa. School metrics (e.g. length, echo energy) for 49 081 manually classified dagaa and non-dagaa schools were used to build an RF school classifier. The best RF model had a classification test accuracy of 85.4%, driven largely by school length, and yielded a biomass of 0.71 MT, only c. 4% different from the manual estimate. The RF classifier offers an efficient method to generate a consistent dagaa biomass time series.
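The contrast between the depth "rule" and the manual classification can be illustrated with a minimal sketch. It is not the authors' code: the `School` fields, the school-level form of the top-third rule, and the four example schools are assumptions made up for illustration. Only the two biomass figures (0.68 and 0.71 MT) come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class School:
    depth_m: float         # mean school depth (hypothetical field)
    bottom_depth_m: float  # water-column depth at the school (hypothetical field)
    is_dagaa: bool         # manual (expert) label


def top_third_rule(school: School) -> bool:
    """School-level analogue of the depth rule: dagaa if in the top third."""
    return school.depth_m <= school.bottom_depth_m / 3.0


def rule_accuracy(schools: list[School]) -> float:
    """Fraction of schools where the depth rule agrees with the manual label."""
    hits = sum(top_third_rule(s) == s.is_dagaa for s in schools)
    return hits / len(schools)


def percent_difference(estimate: float, reference: float) -> float:
    """Signed difference of an estimate from a reference, in percent."""
    return 100.0 * (estimate - reference) / reference


# Made-up schools, NOT survey data: two dagaa and two non-dagaa schools.
schools = [
    School(depth_m=5.0, bottom_depth_m=60.0, is_dagaa=True),
    School(depth_m=30.0, bottom_depth_m=60.0, is_dagaa=True),
    School(depth_m=10.0, bottom_depth_m=45.0, is_dagaa=False),
    School(depth_m=40.0, bottom_depth_m=45.0, is_dagaa=False),
]

accuracy = rule_accuracy(schools)

# Biomass figures from the abstract: RF estimate vs manual estimate (MT).
diff = percent_difference(0.71, 0.68)
print(f"depth-rule agreement: {accuracy:.2f}, RF vs manual: {diff:.1f}%")
```

In this toy set the depth rule mislabels the deep dagaa school and the shallow non-dagaa school, which is the kind of error the abstract reports for the real survey. The second calculation reproduces the "c. 4%" difference between the RF and manual biomass estimates.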

Original language: English
Pages (from-to): 1379-1390
Number of pages: 12
Journal: ICES Journal of Marine Science
Issue number: 4
Early online date: 9 May 2020
Publication status: Published - 1 Jul 2020


  • artificial intelligence
  • big data
  • dagaa
  • Lake Victoria
  • machine learning
  • school analysis
  • species identification
  • stock assessment
  • Rastrineobola argentea
