Microbial communities play a major role in various environmental processes. The engineering of these ecosystems is largely based on the control of their diversity. To date, amplicon sequencing of 16S/18S rDNA, a biodiversity marker gene, is the most efficient and least costly way to assess the species diversity of an ecosystem. However, the bottleneck of these approaches is currently the flood of data they produce, which challenges both current computing architectures and the refinement of processing algorithms. The challenge is to propose solutions that optimize the processing of these data in terms of both infrastructure (disk space, user-friendliness) and computation time.
In this context we developed the FROGS pipeline: "Find Rapidly OTU with Galaxy Solution". Developed for the Galaxy platform, FROGS offers a user-friendly interface and includes several bioinformatics tools for preprocessing, cleaning, dereplication, chimera removal, clustering into OTUs (Operational Taxonomic Units) and taxonomic assignment of sequences. FROGS generates an OTU abundance table with the taxonomic affiliation of each OTU. Finally, the post-processing tool allows users to process this table with user-specified filters and provides statistical results and numerous graphical illustrations of the data. FROGS has been developed to be very fast even on large amounts of data, by using cutting-edge tools and an optimized design. It is also portable to all Galaxy platforms, with minimal computing and architecture dependencies. FROGS was tested on several simulated data sets. The tool is extremely rapid, robust and highly sensitive for the detection of OTUs, with very few false positives compared to other pipelines widely used by the community.
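As a rough illustration of two of the steps named above, dereplication and filtering of an abundance table, a minimal Python sketch might look as follows. It is not taken from the FROGS code: the function names, the toy reads and the `min_total` threshold are all assumptions made for the example, and real data would be read from FASTA/FASTQ files rather than from a list.

```python
from collections import Counter, defaultdict


def dereplicate(reads):
    """Collapse identical sequences into one entry per distinct sequence.

    `reads` is an iterable of (sample_name, sequence) pairs (hypothetical
    input format). Returns {sequence: Counter({sample_name: count})}.
    """
    table = defaultdict(Counter)
    for sample, seq in reads:
        # Compare sequences case-insensitively.
        table[seq.upper()][sample] += 1
    return table


def filter_by_abundance(table, min_total=2):
    """Keep entries whose total count across all samples is >= min_total.

    `min_total` is an illustrative threshold, not a FROGS default.
    """
    return {
        seq: counts
        for seq, counts in table.items()
        if sum(counts.values()) >= min_total
    }


# Toy input: four reads from two samples.
reads = [
    ("sampleA", "ACGT"),
    ("sampleA", "ACGT"),
    ("sampleB", "acgt"),
    ("sampleB", "TTGA"),
]

table = dereplicate(reads)
kept = filter_by_abundance(table, min_total=2)

for seq, counts in kept.items():
    print(seq, dict(counts))
```

In this toy case the singleton sequence is discarded and the remaining sequence keeps its per-sample counts, which is the general shape of an abundance table before clustering and taxonomic affiliation.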
FROGS is currently operational for exploring bacterial and eukaryotic communities through the analysis of 16S, 18S and 23S DNA, and we are working on the integration of new databases that will allow analysis of fungal diversity.
This tool is available on three Galaxy platforms (Toulouse - Sigenae, Jouy - Migale, Rennes - GenOuest) and all programs can be downloaded from the collaborative site GitHub: https://github.com/geraldinepascal/FROGS.git . A training session was held in June 2015 and three more will be held between December and April 2016 in collaboration with the Migale platform. In 9 months, 40 people will be trained in the use of Galaxy, FROGS and statistics related to microbial diversity. FROGS was presented (oral, O, or poster, P) to the NEM network in Toulouse (O), at Pathobiom 2015 in Maisons-Alfort (O), at JOBIM 2015 in Clermont-Ferrand (O + P), at RCAM 2015 in Paris (O), at the Environmental Genomics conference in Montpellier (O + P), at AFEM 2015 in Anglet (P), and at the Galaxy days 2015 at the Pasteur Institute (O).