Frequently Asked Questions

Are there FROGS guidelines or a standard procedure?

FROGS' design is highly modular and so allows users to choose their tools and processing order. However, default values are advised when possible, and a standard procedure for amplicon analysis should follow these steps:
  1. Pre-processing: depending on your data, assemble your paired reads or not. Fill in the size parameters and primer fields according to the amplicon you are studying.
  2. Clustering: thanks to Swarm's capabilities, clustering can be performed early in the process. It should be run with an aggregation distance of 3 and with a denoising step.
  3. Removing chimeras: chimeras are PCR artifacts and should be removed at this step, using the clusters produced by Swarm.
  4. Filtering OTUs: a 0.005% abundance threshold should be applied to remove the remaining noisy clusters and obtain your OTUs. If your experimental design contains replicates, you should also filter out clusters that are not present in at least two, three or more samples (depending on your design).
  5. Taxonomic affiliation: this step should be executed at the end of the process because it is the most time-consuming one. By default, BLAST affiliations and multi-affiliations are produced, but RDP affiliation can be added.
  6. Visualization (optional): use “Cluster stat” (after steps 2, 3 and 4) for supplementary figures and statistics about your clusters (numbers, distributions, etc.). Use “Affiliations stat” (after step 5) for supplementary figures and statistics about your affiliations.
  7. Tree construction (optional): use it after step 4 if you want a phylogenetic tree of your OTUs.
  8. Export functions (optional): use “BIOM to TSV” if you want an abundance table in tabular format. Use “BIOM to standard BIOM” if you need a BIOM file for your statistical analyses.
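
The two filters of step 4 can be illustrated with a minimal Python sketch. It is not FROGS code: the function name, the toy abundance table, the choice to compute the proportion against the total read count over all samples, and the inclusive comparisons are all assumptions made for illustration.

```python
def filter_clusters(abundances, min_proportion=0.00005, min_samples=2):
    """Keep clusters that pass both an abundance and a sample-presence filter.

    abundances: dict mapping a cluster name to its list of per-sample read counts.
    min_proportion: 0.005% written as a fraction (0.005 / 100).
    min_samples: minimum number of samples in which the cluster must be present.
    """
    # Assumption: the proportion is relative to all reads in all samples.
    total = sum(sum(counts) for counts in abundances.values())
    kept = {}
    for name, counts in abundances.items():
        proportion = sum(counts) / total
        presence = sum(1 for count in counts if count > 0)
        if proportion >= min_proportion and presence >= min_samples:
            kept[name] = counts
    return kept


# Hypothetical table: 3 samples, 100000 reads in total.
toy = {
    "Cluster_1": [60000, 39940, 0],  # abundant, present in 2 samples
    "Cluster_2": [3, 0, 0],          # 0.003% of reads: below the threshold
    "Cluster_3": [0, 4, 3],          # 0.007% of reads, present in 2 samples
    "Cluster_4": [0, 0, 50],         # abundant enough, but in 1 sample only
}

print(sorted(filter_clusters(toy)))
```

With this toy table, only Cluster_1 and Cluster_3 pass both filters.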

What data are processed by FROGS?

FROGS is designed for the study of microbial communities from amplicon sequencing. The amplified region is chosen to be as discriminating as possible within the community you are interested in. For example, researchers favour part of the 16S ribosomal RNA gene when studying the bacterial composition of an environment.

However, FROGS can be used on any amplicon as long as the region of interest respects the following constraint:

In the standard protocol, the target DNA must be completely sequenced in the reads: either a single-end read starts at the 5’ primer and finishes at the end of the 3’ primer, or, in the paired-end case, the forward and reverse reads overlap.
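
For paired-end data, a quick arithmetic check tells you whether the two reads can overlap at all. The sketch below is only illustrative: the function name and the read and amplicon lengths are hypothetical, and the amplicon length is assumed to include both primers.

```python
def expected_overlap(read1_length, read2_length, amplicon_length):
    """Return the number of bases covered by both reads.

    A negative value means the reads do not meet, so the amplicon
    cannot be reconstructed by overlapping them.
    """
    return read1_length + read2_length - amplicon_length


# Hypothetical 2 x 250 bp run.
print(expected_overlap(250, 250, 450))  # 50 bases shared by both reads
print(expected_overlap(250, 250, 550))  # -50: a 50-base gap between the reads
```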

Which filters should I use in FROGS-Filters?

FROGS offers various filters to meet users' needs, but depending on your data and scientific question, only some of them should be used. The most commonly used filters are the OTU filters based on samples and abundances.

How do I fill in the 5’ and 3’ primer fields in the FROGS-Preprocess tool?

The most frequent error made by new FROGS users is filling in the 5’ and 3’ primers incorrectly. Make sure that you follow the instructions available at the bottom of the tool: primers should be provided as they are read on a 5’->3’ sequence. Generally, that means that your 3’ primer should be reverse-complemented.

Example:

5' ATGCCC GTCGTCGTAAAATGC ATTTCAG 3'
Value for parameter 5' primer: ATGCCC
Value for parameter 3' primer: ATTTCAG

Degenerate nucleotides are accepted.
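
If your reverse primer is written as it was ordered (5’->3’ on the opposite strand), the value for the 3’ primer field is its reverse complement. The following Python sketch shows one way to compute it, including degenerate IUPAC codes. The function name and the "as ordered" sequence CTGAAAT are hypothetical; CTGAAAT is chosen only because its reverse complement matches the 3’ primer of the example above.

```python
# IUPAC nucleotide codes and their complements.
IUPAC_COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "S": "S", "W": "W",
    "K": "M", "M": "K", "B": "V", "V": "B",
    "D": "H", "H": "D", "N": "N",
}


def reverse_complement(primer):
    """Return the reverse complement of a primer, keeping IUPAC degeneracy."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(primer.upper()))


print(reverse_complement("CTGAAAT"))  # ATTTCAG
print(reverse_complement("GTNR"))     # YNAC
```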

What is the “custom protocol” parameter in the FROGS-Preprocess tool?

(Illumina data only) The custom protocol corresponds to the protocol of Kozich et al. (2013), in which the PCR primers are also used as sequencing adaptors, so the PCR primers are not included in the resulting sequences. Choose this parameter only if you used such a sequencing protocol; otherwise, select the standard protocol.

What is the “mismatch rate” parameter in the FROGS-Preprocess tool?

(paired-end Illumina reads only) The mismatch rate is the proportion of mismatches allowed when merging paired-end reads with FLASH in the FROGS-Preprocess tool. A rate of 0.1 means that up to 10% mismatches are allowed in the overlapping region. Read pairs with more mismatches are discarded.
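
As a rough illustration of the arithmetic, the following Python sketch tests an overlap against a mismatch rate. It is not the merging tool's implementation: the function name and the example numbers are hypothetical, and treating the limit as inclusive is an assumption.

```python
def passes_mismatch_rate(overlap_length, mismatches, mismatch_rate=0.1):
    """Return True when the fraction of mismatches in the overlap is within the rate."""
    if overlap_length <= 0:
        # No overlap at all: the pair cannot be merged.
        return False
    # Assumption: a pair exactly at the limit is still accepted.
    return mismatches / overlap_length <= mismatch_rate


# Hypothetical 50-base overlap with a 0.1 rate: at most 5 mismatches fit.
print(passes_mismatch_rate(50, 4))  # True
print(passes_mismatch_rate(50, 6))  # False
```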

What are the “minimum/maximum amplicon size” parameters in the FROGS-Preprocess tool?

(paired-end Illumina reads only) These parameters are the sizes below and above which assembled paired reads are discarded. They make it possible to filter out badly assembled paired reads.
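
The effect of these two parameters can be pictured with a short Python sketch. The function name, the toy sequences and the 400 to 450 size window are hypothetical, and whether the bounds themselves are kept (inclusive comparison) is an assumption.

```python
def keep_by_amplicon_size(sequences, min_size, max_size):
    """Keep only assembled sequences whose length lies within [min_size, max_size]."""
    return [seq for seq in sequences if min_size <= len(seq) <= max_size]


# Hypothetical assembled reads of 380, 420 and 500 bases.
assembled = ["A" * 380, "A" * 420, "A" * 500]
kept = keep_by_amplicon_size(assembled, min_size=400, max_size=450)
print([len(seq) for seq in kept])  # [420]
```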

What is Swarm and how does it work?

Swarm is a clustering algorithm that does not rely on a fixed global clustering threshold. For more information about Swarm, see https://github.com/torognes/swarm, Mahé F, et al. (2014) and Mahé F, et al. (2015).