FN Archimer Export Format
PT J
TI Multivariate Cutoff Level Analysis (MultiCoLA) of large community data sets
BT
AF GOBET, Angelique
   QUINCE, Christopher
   RAMETTE, Alban
AS 1:1,2;2:3;3:1;
FF 1:;2:;3:;
C1 Max Planck Inst Marine Microbiol, Microbial Habitat Grp, Bremen, Germany.
   Jacobs Univ Bremen GmbH, Bremen, Germany.
   Univ Glasgow, Dept Civil Engn, Glasgow G12 8QQ, Lanark, Scotland.
C2 INST MAX PLANCK, GERMANY
   UNIV BREMEN, GERMANY
   UNIV GLASGOW, UK
IF 7.836
TC 79
UR https://archimer.ifremer.fr/doc/00488/59949/63199.pdf
   https://archimer.ifremer.fr/doc/00488/59949/63200.pdf
   https://archimer.ifremer.fr/doc/00488/59949/63201.pdf
LA English
DT Article
AB High-throughput sequencing techniques are becoming attractive to molecular biologists and ecologists as they provide a time- and cost-effective way to explore diversity patterns in environmental samples at an unprecedented resolution. An issue common to many studies is the definition of what fractions of a data set should be considered as rare or dominant. Yet this question has not been satisfactorily addressed, nor has the impact of such a definition on data set structure and interpretation been fully evaluated. Here we propose a strategy, MultiCoLA (Multivariate Cutoff Level Analysis), to systematically assess the impact of various abundance or rarity cutoff levels on the resulting data set structure and on the consistency of the further ecological interpretation. We applied MultiCoLA to a 454 massively parallel tag sequencing data set of V6 ribosomal sequences from marine microbes in temperate coastal sands. Consistent ecological patterns were maintained after removing up to 35-40% rare sequences and similar patterns of beta diversity were observed after denoising the data set by using a preclustering algorithm of 454 flowgrams. This example validates the importance of exploring the impact of the definition of rarity in large community data sets. Future applications can be foreseen for data sets from different types of habitats, e.g. other marine environments, soil and human microbiota.
PY 2010
PD AUG
SO Nucleic Acids Research
SN 0305-1048
PU Oxford Univ Press
VL 38
IS 15
UT 000281345900004
DI 10.1093/nar/gkq545
ID 59949
ER
EF