Malacological survey in a bottle of water: A comparative study between manual sampling and environmental DNA metabarcoding approaches

To assess the effect of anthropogenic activities on ecosystems, it is of prime importance to develop new tools enabling a rapid characterisation of ecological communities. Freshwater ecosystems are particularly impacted and threatened by human activities and need thorough attention to preserve their biodiversity and the ecological services they provide. Studying such ecosystems is generally difficult because the associated organisms are hard to sample and to monitor. We present a ready to use environmental metabarcoding diagnostic tool to characterise and monitor the freshwater malacofauna from water samples. The efficiency of this new tool was compared to a classical malacological survey at 19 sampled sites from 10 distinct rivers distributed over Corsica Island (France). Our eDNA monitoring tool demonstrated a remarkable ability to reconstitute the local malacofauna compared to the malacological survey, with 97.1% of species detection confirmed by both methods. The present tool successfully detected the 11 freshwater snail species previously reported in Corsica by malacological survey but was limited at the genus level for some species. Moreover, our malacological survey allowed an update of the local distribution of a wide diversity of freshwater snails including invasive species (i.e. Potamopyrgus antipodarum and Physa acuta) as well as snail hosts of pathogens of medical and veterinary importance (i.e. Bulinus truncatus and Galba truncatula).


49
Anthropogenic activities have adverse effects on ecosystems (Parmesan and Yohe, 2003), 50 including a decrease of biodiversity (Waldron et al., 2017), habitat fragmentation (Strayer and 51 Dudgeon, 2010) and pollution (Blettler et al., 2018). Freshwater ecosystems appear to be 52 particularly threatened by human activities leading to local species extinction (Blettler et al., 53 2018). According to the IUCN Red list, 46% of these ecosystems worldwide are endangered 54 or vulnerable (Janssen et al., 2016). However, these ecosystems host 10% of all known 55 species despite representing only 0.8% of the Earth's surface (Strayer and Dudgeon, 2010). 56 Moreover, they also provide important ecological services for the development of human 57 populations (Carpenter et al., 2011). 58 Over freshwater species, gastropods have received particular attention for several 59 reasons. First, they are suffering massive extinction notably due to habitat loss/degradation, 60 the introduction of alien species and, in some cases, overexploitation (Bouchet et al., 1999, 61 Johnson et al., 2013, Lydeard et al., 2004. Second, they play a key role in the functioning of 62 most freshwater habitats. They are involved in the cycle of several biochemicals especially 63 nitrogen (Hill and Griffiths, 2017) and are at the basis of many food webs especially fishes 64 (Dillon, 2000). Freshwater snails are also reliable bio indicators in particular for detecting the 65 presence of environmental organic or inorganic pollutants such as heavy metals, or 66 microorganisms that are accumulated in their tissues (Mahmoud andAbu Taleb, 2013, 67 Tallarico, 2016). Finally, several freshwater gastropods are of medical or veterinary 68 subdivided into four equal parts and extracted following the previously published protocol, 140 see Supp file 2 for details. 141 In parallel, DNA was extracted from one individual of each of the 11 snail species 142 collected during the malacological survey to produce a synthetic "mock community". To this 143 aim, we used the E.Z.N.A.® Tissue DNA kit (OMEGA bio-tek, Inc) following the "tissue 144 protocol". These snail DNA extractions were used as positive PCR and sequencing controls in 145 the following steps. 146 To refine the molecular species assignations obtained from subsequent NGS analyses, 147 we sequenced a 16S rRNA region for six snail species collected in Corsica. The 16Sbr-H 148 primers were used following the PCR conditions described in (Saito et al., 2018). The 149 resulting amplicons contain the barcode region used for the metabarcoding. The PCR products 150 were then sequenced on an ABI 3730xl sequencer at the GenoScreen platform (Lille, France).

PCR primers 155
The primer pair used in this study (Gast01) was previously developed and tested in silico 156 (Taberlet et al., 2018) but never tested and/or applied for empirical purposes. These primers 157 target a 60-70 bp fragment of 16S mitochondrial rRNA. Based on their in silico analyses, 158 these primers theoretically amplify at least 1,280 species distributed among 456 genera, 159 mainly related to Gastropoda. In our study, we first validated that these primers were able to 160 produce a size-expected amplicon using qPCR for the 11 snail species collected during the 161 malacological survey and eight additional snail species from previous field works (Table 1). 162 These 19 DNA extracts were amplified by qPCR on a LightCycler®480 qPCR device 163 following the same protocol as for the first PCR step of the library preparation, see 2.3 below. 164 165

Library preparation and MiSeq amplicon sequencing 166
A total of 34 samples including the 23 filtration membranes from the 23 sampling sites, five 167 negative controls and six positive controls were processed. Negative extraction controls 168 consisted in ultrapure water samples processed following the same DNA extraction protocol 169 as for the filtration membranes in each extraction runs (N = 2). PCR negative controls 170 consisted in PCR reactions performed using water as template in each PCR run (N = 3). 171 Positive controls consisted in two categories of mock communities. The first category 172 consisted in an equimolar pool of 12 DNA extracts obtained from individuals of each snail 173 species identified in Corsica and during previous field missions (Table 1) clusters (i.e. <10 sequences) were removed. (iv) Each OTU was assigned to a species or taxon 214 through a two-step MEGABLAST affiliation procedure. The first MEGABLAST analysis 215 was restricted to Mollusca species. The second MEGABLAST analysis was performed 216 without restricting parameters to check the robustness of the previous affiliations. The 10 best 217 hits were kept for subsequent analysis. (v) The resulting OTUs were filtered following two 218 criteria: first, only OTUs presenting a minimal blast coverage of 84% of amplicon length were 219 kept; second, only OTUs presenting a pairwise identity above 89% with the affiliated 220 sequence were kept. The remaining OTUs were considered as "unassigned". 221 Lastly, to confirm that a given OTU was detected in a specific sample, we have 222 adapted our validation criteria depending on OTUs abundances. An abundant OTU (i.e. 223 ≥ 1000 copies considering all the samples) was considered present in a specific membrane 224 when reaching at least 10 copies (overall the four membrane sections). Regarding poorly 225 abundant OTU (i.e. < 1000 copies considering all the samples), we considered a sample 226 positive if four or more copies of such OTU were distributed among two quarters of a same 227 membrane. 228

230
Environmental variables for the 23 sites are presented in (Table S1). All of these sites were 231 lotic ecosystems, pH values ranged from 7.33 to 8.7, temperatures ranged from 16.5°C to 232 26.7°C and elevation ranged from 4 m to 558 m. 233

Malacological survey 235
Overall, 11 distinct freshwater snail species were identified among the 23 sampled sites 236 (Table 1-S3). These species were already registered in Corsica (INPN, 2020, IUCN, 2020) 237 where 18 species of freshwater mollusc have been described so far, five of which inhabiting 238 lentic ecosystems (Table S3) and (Mouthon, 1982, INPN, 2020. The species richness ranged 239 from one (site 4) to six (site 23). The most common species was P. antipodarum (22/23 sites) 240 and the rarest were T. fluviatilis (1/23 sites) and Bithynia tentaculata (3 individuals at site 13). 241 Among the species identified, two are invasive (i.e. P. antipodarum and P. acuta) and two are 242 Over the 19 snail DNA extracts tested in qPCR with metabarcoding primers (Table 1) Excluding the four contaminated sites, the whole sequencing (controls + field samples) 256 generated more than 14 M of clusters with an average of 97,000 sequences per library. After 257 analyses, 8.7 M of sequences were affiliated (Fig. 2). These sequences were grouped among 258 488 OTUs with 43 OTUs corresponding to Mollusca species. After removing the sequences 259 from control samples, on the 7,290,067 remaining sequences, 6% (445,638) were affiliated to 260 Mollusca; 10% (715,308) were unaffiliated and 84% (6,129,121) were affiliated to non-261 mollusc species (Fig. 2). Whatever the taxonomic group, the OTU affiliations were generally 262 limited to the genus and rarely reached the species level (e.g. only four out of 11 freshwater 263 snail species found by metabarcoding were affiliated to a single species name). However, the 264 16S sequences obtained from snail DNA extracts collected in Corsica, allowed to recover the 265 corresponding species name for major Mollusca OTUs. All the genus or species detected 266 using eDNA corresponded to species already identified in Corsica and no new genus was 267 detected compared to our malacological survey. Hence, considering all prospected sites, we 268 detected 61.1% (11/18) of historically known freshwater molluscs in Corsica and 84.6% 269 (11/13) of species occurring in lotic systems (Table S3) and (INPN, 2020, Mouthon, 1982. 270 The detected snail species belong to three subclasses of the Gastropoda (i.e. Neritimorpha, 271 Caenogastropoda, and Heterobranchia) (Fig. 3). The non-detected subclasses (e.g.  (Fig. 4). Conversely, the malacological 284 survey confirmed the detections obtained by eDNA monitoring for 77% (67/87) of the 285 detections (Fig. 4). Moreover, eDNA monitoring detected on average 1.8±2.84 more species 286 per site than the malacological survey although this difference was not significant (Wilcoxon 287 rank test; W = 24, P = 0.11). Based on the eDNA approach, B. truncatus was detected at 11 288 sites while only visually detected at 4 sites (Figs 4-5a.). At the river scale, the results obtained 289 from direct malacological survey and from the eDNA monitoring approach gave even more 290 congruent results (Fig. 3). 291 Among all the freshwater species detected in Corsica, two species are hosts for the 292 transmission of trematode of medical and veterinary importance (i.e. B. truncatus, Fig. 5a and 293 G. truncatula, Fig. 5b) and two other are invasive species (i.e. P. antipodarum, Fig. 5c and P. 294 acuta, Fig. 5d). The two invasive species display a wide distribution range (17 and 9 sites 295 among the 19 monitored sites irrespective of the method) while the two species of 296 medical/veterinary importance were mainly distributed in the Southern part of the Island. operator (≈ 30 minutes from sampling to sample preservation) is needed (Mulero et al., 2020). 316 In comparison, the time needed to realise six measure units in classical malacological survey 317 is much longer (>2 hours) depending on field practicability and experimenter skill level. 318 Moreover, provided that the molecular databases are well documented and free of false 319 molecular taxonomic assignations, the eDNA based approach can provide insights on 320 gastropod diversity at a given site without having malacological expertise. In the present 321 study, we did not try to optimise the volume of water sampled. However, in the case of larger 322 sampling campaigns it might be interesting to determine the lowest volume of water necessary 323 to ensure the same diagnosis efficiency. We recently shown, using a targeted approach, that 324 lower water volumes down to one litre allows detecting B. truncatus without losses in efficacy 325 (Mulero et al., 2020). 326 Using the eDNA monitoring approach, we tended to detect more species per site than 327 with the malacological survey. Indeed, over the 87 detection events found using the eDNA 328 monitoring, 77% of them (67) were corroborated by visual observations during the 329 malacological survey. This apparent higher richness found by eDNA metabarcoding approach 330 compared to classical survey was already reported in previous studies (Deiner et  Regarding primer amplification biases, further applications of the current tool are needed to 350 evaluate these errors considering that it is impossible to test the primers with the DNA of all 351 known organisms in vitro. In our study, such biases are likely to be limited because our 352 controlled mock communities were detected with no apparent amplification & sequencing 353 biases between species. Concerning DNA contamination, the four sites that could have been 354 contaminated were discarded, moreover, the detections found using eDNA monitoring yet not 355 confirmed by malacological survey, followed an ecologically realistic distribution. For G. truncatula, the two sites found positive only by eDNA monitoring were distributed on the 357 Gravona River, downstream from a site where G. truncatula was detected by the two 358 approaches (i.e. site 5). The two other sites were distributed on the Rizzanese river in which 359 Lymnaeid snails hosts of F. hepatica were previously identified (Gretillat, 1963). Noteworthy, 360 this species prefers temporary water body, but could be found in river shore and associated 361 trickle (Dreyfuss et al., 2009), which might explain the observed relatively low densities. 362 Regarding B. truncatus, the detection differences are the most noticeable between the 363 two approaches, the seven additional sites were mainly distributed in the Gravona River and 364 nearby rivers. Even if these non-randomly distributed detections support rather the non-365 detection of certain species by malacological survey, than false positive eDNA detection, a 366 thorough field survey of these rivers is needed to validate our results.  (Nicholson et al., 2020). In this way, we collected environmental variables at all 384 sampling sites including some that are generally neglected (e.g. pH, waterflow, water 385 temperature; Nicholson et al., 2020). However, the sampling of a wider diversity of 386 freshwater environment is still needed to accurately assess the effect of these variables on our 387 ability to detect species. Fasciola hepatica is another trematode species already present in Corsica that uses G. 404 truncatula as intermediate host (Oviedo et al., 1996). This parasite is responsible for the 405 zoonotic Fasciolosis disease generally infecting livestock and Humans (Valero et al., 2002).    (Table S1) for coordinates and environmental variables. The sites in black were sampled 616 but removed from the analysis due to potential DNA contaminations (see text for details). 617   Columns represent species and are arranged symmetrically between the two detection methods. 638 The number in each cell is the semi-quantitative abundance of each species at each site. 639 America. Black dot are sites were the given species is absent based for the two monitoring 647 methods; Purple circles are sites were the given species was detected with the two monitoring 648 methods; Blue circles are sites were the given species was detected using the eDNA monitoring 649 approach only; Orange circles are sites were the given species was detected using the 650 malacological survey only. 651