Benchmarking bioinformatic tools for fast and accurate eDNA metabarcoding species identification
Type | Article |
---|---|
Date | 2021-10 |
Language | English |
Author(s) | Mathon Laetitia |
Affiliation(s) | 1 : CEFE, Univ. Montpellier, CNRS, EPHE‐PSL University, IRD, Univ Paul Valéry Montpellier 3, Montpellier, France 2 : SPYGEN, 17 rue du Lac Saint‐André, Savoie Technolac, 73370 Le Bourget du Lac, France 3 : Université Laval, IBIS (Institut de Biologie Intégrative et des Systèmes), 1030 av. de la Médecine, Québec QC G1V 0A6, Canada 4 : IFREMER ‐ IRSI ‐ Service de Bioinformatique (SeBiMER), 29280 Plouzané, France 5 : Univ. Grenoble Alpes, Univ. Savoie Mont Blanc, CNRS, LECA, Laboratoire d’Écologie Alpine, F‐38000 Grenoble, France 6 : MARBEC, Univ. Montpellier, CNRS, IRD, Ifremer, Montpellier, France 7 : Institut Universitaire de France, IUF, Paris 75231, France |
Source | Molecular Ecology Resources (1755-098X) (Wiley), 2021-10, Vol. 21, N. 7, P. 2565-2579 |
DOI | 10.1111/1755-0998.13430 |
WOS© Times Cited | 12 |
Keyword(s) | benchmark, bioinformatics, eDNA, metabarcoding, sensitivity, species identification |
Abstract | Bioinformatic analysis of eDNA metabarcoding data is crucial toward rigorously assessing biodiversity. Many programs are now available for each step of the required analyses, but their relative abilities at providing fast and accurate species lists have seldom been evaluated. We used simulated mock communities and real fish eDNA metabarcoding data to evaluate the performance of 13 bioinformatic programs and pipelines to retrieve fish occurrence and read abundance using the 12S mt rRNA gene marker. We used four indices to compare the outputs of each program with the simulated samples: sensitivity, F-measure, root-mean-square error (RMSE) on read relative abundances, and execution time. We found marked differences among programs only for the taxonomic assignment step, both in terms of sensitivity, F-measure and RMSE. Running time was highly different between programs for each step. The fastest programs with best indices for each step were assembled into a pipeline. We compare this pipeline to pipelines constructed from existing toolboxes (OBITools, Barque, and QIIME 2). Our pipeline and Barque obtained the best performance for all indices and appear to be better alternatives to highly used pipelines for analyzing fish eDNA metabarcoding data with a complete reference database. Real eDNA metabarcoding data also indicated differences for taxonomic assignment and execution time only. This study reveals major differences between programs during the taxonomic assignment step. The choice of algorithm for the taxonomic assignment can have a significant impact on diversity estimates and should be made according to the objectives of the study. |
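
The abstract names three accuracy indices (sensitivity, F-measure, and RMSE on read relative abundances) without giving their formulas. The sketch below shows one common way to compute them when comparing a program's output with a simulated sample. It assumes the standard textbook definitions (sensitivity = TP / (TP + FN), F-measure = harmonic mean of precision and sensitivity, RMSE taken over the union of expected and observed species); the article's exact definitions may differ. The function names and species labels are hypothetical and are not taken from the article.

```python
import math


def occurrence_metrics(expected_species, detected_species):
    """Return (sensitivity, f_measure) for a detected species list.

    Assumes the standard definitions, which may differ from the article's.
    """
    expected = set(expected_species)
    detected = set(detected_species)
    tp = len(expected & detected)   # species present and detected
    fn = len(expected - detected)   # species present but missed
    fp = len(detected - expected)   # species detected but not present
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return sensitivity, f_measure


def relative_abundance_rmse(expected_reads, observed_reads):
    """RMSE between expected and observed relative read abundances.

    Both arguments map species name -> read count. Species missing from
    one side are treated as having zero reads on that side (an assumption).
    """
    species = sorted(set(expected_reads) | set(observed_reads))
    if not species:
        return 0.0
    exp_total = sum(expected_reads.values())
    obs_total = sum(observed_reads.values())
    squared_errors = []
    for name in species:
        exp_rel = expected_reads.get(name, 0) / exp_total if exp_total else 0.0
        obs_rel = observed_reads.get(name, 0) / obs_total if obs_total else 0.0
        squared_errors.append((exp_rel - obs_rel) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical mock community and program output (illustrative counts).
    expected = {"species_A": 50, "species_B": 30, "species_C": 20}
    observed = {"species_A": 50, "species_B": 25, "species_D": 25}
    sens, f1 = occurrence_metrics(expected, observed)
    rmse = relative_abundance_rmse(expected, observed)
    print(f"sensitivity={sens:.3f} F-measure={f1:.3f} RMSE={rmse:.3f}")
```

Execution time, the fourth index, would be measured separately for each program and is not covered by this sketch.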