FN Archimer Export Format
PT J
TI Analysis of the genomic basis of functional diversity in dinoflagellates using a transcriptome-based sequence similarity network
BT
AF MENG, Arnaud
   CORRE, Erwan
   PROBERT, Ian
   GUTIERREZ-RODRIGUEZ, Andres
   SIANO, Raffaele
   ANNAMALE, Anita
   ALBERTI, Adriana
   DA SILVA, Corinne
   WINCKER, Patrick
   LE CROM, Stephane
   NOT, Fabrice
   BITTNER, Lucie
AS 1:1;2:2;3:3;4:4;5:5;6:6,7,8;7:6,7,8;8:6,7,8;9:6,7,8;10:1;11:9;12:1;
FF 1:;2:;3:;4:;5:PDG-ODE-DYNECO-PELAGOS;6:;7:;8:;9:;10:;11:;12:;
C1 Sorbonne Univ, UPMC Univ Paris 06, Univ Antilles Guyane, Univ Nice Sophia Antipolis,CNRS,EPS IBPS, Paris, France.
   UPMC, CNRS, FR2424, ABiMS,Stn Biol, Roscoff, France.
   UPMC, CNRS, FR2424, Roscoff Culture Collect,Stn Biol Roscoff, Pl Georges Teissier, Roscoff, France.
   Natl Inst Water & Atmospher Res NIWA Ltd, Wellington, New Zealand.
   IFREMER, Ctr Brest, DYNECO PELAGOS, Plouzane, France.
   CEA, Inst Genom, GENOSCOPE, Evry, France.
   CNRS, UMR8030, Evry, France.
   Univ Evry Val dEssonne, Evry, France.
   CNRS, UMR 7144, Stn Biol Roscoff, Pl Georges Teissier, Roscoff, France.
C2 UNIV PARIS 06, FRANCE
   UNIV PARIS 06, FRANCE
   UNIV PARIS 06, FRANCE
   NIWA, NEW ZEALAND
   IFREMER, FRANCE
   CEA, FRANCE
   CNRS, FRANCE
   UNIV EVRY VAL DESSONNE, FRANCE
   CNRS, FRANCE
SI BREST
SE PDG-ODE-DYNECO-PELAGOS
IN WOS Ifremer jusqu'en 2018 copubli-france copubli-univ-france copubli-int-hors-europe
IF 5.855
TC 10
UR https://archimer.ifremer.fr/doc/00444/55550/57209.pdf
LA English
DT Article
DE ;genomics;proteomics;microbial biology;molecular evolution;protists;transcriptomics
AB Dinoflagellates are one of the most abundant and functionally diverse groups of eukaryotes. Despite an overall scarcity of genomic information for dinoflagellates, constantly emerging high-throughput sequencing resources can be used to characterize and compare these organisms. We assembled de novo and processed 46 dinoflagellate transcriptomes and used a sequence similarity network (SSN) to compare the underlying genomic basis of functional features within the group. This approach constitutes the most comprehensive picture to date of the genomic potential of dinoflagellates. A core-predicted proteome composed of 252 connected components (CCs) of putative conserved protein domains (pCDs) was identified. Of these, 206 were novel and 16 lacked any functional annotation in public databases. Integration of functional information in our network analyses allowed investigation of pCDs specifically associated with functional traits. With respect to toxicity, sequences homologous to those of proteins found in species with toxicity potential (e.g., sxtA4 and sxtG) were not specific to known toxin-producing species. Although not fully specific to symbiosis, the most represented functions associated with proteins involved in the symbiotic trait were related to membrane processes and ion transport. Overall, our SSN approach led to identification of 45,207 and 90,794 specific and constitutive pCDs of, respectively, the toxic and symbiotic species represented in our analyses. Of these, 56% and 57%, respectively (i.e., 25,393 and 52,193 pCDs), completely lacked annotation in public databases. This stresses the extent of our lack of knowledge, while emphasizing the potential of SSNs to identify candidate pCDs for further functional genomic characterization.
PY 2018
PD MAY
SO Molecular Ecology
SN 0962-1083
PU Wiley
VL 27
IS 10
UT 000433589000004
BP 2365
EP 2380
DI 10.1111/mec.14579
ID 55550
ER
EF