THAPBI PICT - a fast, cautious, and accurate metabarcoding analysis pipeline

Peter Cock, David E. L. Cooke, Peter Thorpe, Leighton Pritchard

Research output: Contribution to journal › Article › peer-review


Abstract

THAPBI PICT is an open source software pipeline for metabarcoding analysis of Illumina paired-end reads, including cases of multiplexing where more than one amplicon is amplified per DNA sample. Initially a Phytophthora ITS1 Classification Tool (PICT), we demonstrate using worked examples with our own and public datasets how, with appropriate primer settings and a custom database, it can be applied to other amplicons and organisms, and used for reanalysis of existing datasets. The core dataflow of the implementation is (i) data reduction to unique marker sequences, often called amplicon sequence variants (ASVs), (ii) dynamic thresholds for discarding low-abundance sequences to remove noise and artifacts (rather than error correction by default), before (iii) classification using a curated reference database. The default classifier assigns a label to each query sequence based on a database match that is either perfect, or a single base pair edit away (substitution, deletion or insertion). Abundance thresholds for inclusion can be set by the user or automatically using per-batch negative or synthetic control samples. Output is designed for practical interpretation by non-specialists and includes a read report (ASVs with classification and counts per sample), a sample report (samples with counts per species classification), and a topological graph of ASVs as nodes with short edit distances as edges. Source code is available from https://github.com/peterjc/thapbi-pict/ with documentation including installation instructions.
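
The three-step dataflow described in the abstract can be illustrated with a minimal Python sketch. This is not the THAPBI PICT implementation or its API: the function names, the toy sequences, and the rule that a sequence must exceed the largest count seen in a negative control are all assumptions made for illustration only.

```python
from collections import Counter


def within_one_edit(a: str, b: str) -> bool:
    """True if a and b are identical or differ by exactly one
    substitution, insertion, or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Find the first position where the two strings disagree.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # One substitution: everything after the mismatch must agree.
        return a[i + 1:] == b[i + 1:]
    # One extra base in b: skip it and the remainders must agree.
    return a[i:] == b[i + 1:]


def abundance_threshold(user_min: int, control_counts: Counter) -> int:
    """Hypothetical rule: the higher of the user's minimum and one more
    than the largest count observed in the negative control."""
    return max(user_min, max(control_counts.values(), default=0) + 1)


def classify(seq: str, reference: dict) -> str:
    """Label from a perfect match, else from all references one edit away.
    Returns an empty string when nothing matches."""
    if seq in reference:
        return reference[seq]
    labels = {lab for ref, lab in reference.items() if within_one_edit(seq, ref)}
    return ";".join(sorted(labels))


def run_pipeline(sample_reads, control_reads, reference, user_min=3):
    # (i) reduce reads to unique sequences with counts
    counts = Counter(sample_reads)
    # (ii) discard low-abundance sequences using a control-driven threshold
    threshold = abundance_threshold(user_min, Counter(control_reads))
    # (iii) classify the survivors against the reference database
    return {
        seq: (count, classify(seq, reference))
        for seq, count in counts.items()
        if count >= threshold
    }


# Toy data: the reference sequences and species labels are invented.
reference = {"ACGTACGT": "Species A", "ACGTTCGA": "Species B"}
sample = ["ACGTACGT"] * 50 + ["ACGAACGT"] * 12 + ["TTTTTTTT"] * 30 + ["ACGTACGG"] * 2
control = ["ACGTACGT"] * 4

for seq, (count, label) in run_pipeline(sample, control, reference).items():
    print(seq, count, label or "(unclassified)")
```

In this toy run the negative control raises the threshold from 3 to 5, so the variant seen only twice is discarded before classification. The variant one substitution away from a reference entry inherits that entry's label, while the abundant sequence with no close reference match is kept but left unclassified.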
Original language: English
Article number: e15648
Number of pages: 17
Journal: PeerJ
Volume: 11
Publication status: Published - 18 Aug 2023

Funding

This research was supported by a grant funded jointly by the Biotechnology and Biological Sciences Research Council (BBSRC), Department for Environment, Food and Rural Affairs (DEFRA), Economic and Social Research Council (ESRC), Forestry Commission, Natural Environment Research Council (NERC) and Scottish Government, under the Tree Health and Plant Biosecurity Initiative, grant number BB/N023463/1. It was also partly funded by DEFRA as part of the Future Proofing Plant Health project in support of Euphresco ID-PHYT, and by the Rural & Environment Science & Analytical Services (RESAS) Division of the Scottish Government. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Keywords

  • open source software
  • metabarcoding
  • DNA
