A cost focused framework for optimizing collection and annotation of ultrasound datasets

Alistair Lawley*, Rory Hampson, Kevin Worrall, Gordon Dobie

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Machine learning for medical ultrasound imaging faces a major challenge: the prohibitive cost of producing and annotating clinical data. The trade-off between cost and sample size is well understood in the context of clinical trials, and the same methods can be applied to optimize data collection and annotation, ultimately reducing the cost and duration of machine learning feasibility studies. This paper presents a two-phase framework that quantifies the cost of data collection using iterative accuracy/sample-size predictions and uses active learning to guide and optimize full human annotation of medical ultrasound images for machine learning. Potential cost reductions are demonstrated using public breast, fetal, and lung ultrasound datasets and a practical case study on breast ultrasound. The results show that, just as with clinical trials, the relationship between dataset size and final accuracy can be predicted, with the majority of accuracy improvements achieved using only 40-50% of the data, depending on the tolerance measure. Manual annotation can be reduced further using active learning, resulting in a representative cost reduction of 66% at a tolerance of around a 4% accuracy drop from the theoretical maximum. The significance of this work lies in its ability to quantify how much additional data and annotation will be required to achieve a specific research objective. These methods are already well understood by clinical funders and so provide a valuable and effective framework for feasibility and pilot studies in which machine learning is applied within a fixed budget to maximize predictive gains, informing resourcing and further clinical study.
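
The accuracy/sample-size prediction described in the abstract can be illustrated with a minimal sketch. It assumes an inverse power-law learning curve, error(n) = b * n^(-c), fitted by least squares in log-log space; the abstract does not state which functional form the paper uses, so this form, the function names, and the pilot numbers below are all illustrative assumptions rather than the authors' method or data.

```python
import math


def fit_power_law(sizes, accuracies):
    """Fit error = b * n**(-c) by ordinary least squares on log-log values.

    Assumed model (not taken from the paper): accuracy(n) = 1 - b * n**(-c).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(1.0 - a) for a in accuracies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    c = -slope                              # decay exponent
    b = math.exp(mean_y - slope * mean_x)   # error scale
    return b, c


def predicted_accuracy(n, b, c):
    """Accuracy predicted by the fitted curve at sample size n."""
    return 1.0 - b * n ** (-c)


def smallest_size_within_tolerance(full_size, b, c, tolerance):
    """Smallest n whose predicted accuracy is within `tolerance`
    of the predicted accuracy at `full_size`."""
    target = predicted_accuracy(full_size, b, c) - tolerance
    for n in range(1, full_size + 1):
        if predicted_accuracy(n, b, c) >= target:
            return n
    return full_size


# Hypothetical pilot measurements (illustrative only, not from the paper).
pilot_sizes = [100, 200, 400, 800]
pilot_accuracies = [0.80, 0.86, 0.90, 0.93]

b, c = fit_power_law(pilot_sizes, pilot_accuracies)
n_needed = smallest_size_within_tolerance(2000, b, c, tolerance=0.04)
print(f"fitted b={b:.3f}, c={c:.3f}; samples needed: {n_needed} of 2000")
```

Multiplying the resulting sample count by a per-sample collection and annotation cost would give a budget estimate for a chosen tolerance; the per-sample cost is a further input that would have to come from the study itself.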
Original language: English
Article number: 106048
Number of pages: 13
Journal: Biomedical Signal Processing and Control
Volume: 92
Early online date: 7 Feb 2024
Publication status: Published - 30 Jun 2024

Funding

This work was supported by the UK Engineering and Physical Sciences Research Council (EPSRC) Future Ultrasonic Engineering Center for Doctoral Training (FUSE CDT) under grants EP/S023879/1 and 2296317.

Keywords

  • ultrasound
  • medical imaging
  • cost effectiveness
  • active learning
  • deep learning
