Metabarcoding Runner
This service computes a list of taxon IDs detected in metabarcoding sequences from environmental DNA (eDNA) samples. It comprises eight substeps (1.1 to 1.8) implemented as a single step: (1) sequencing error correction (using BayesHammer-SPAdes); (2) pairwise alignment, (3) pre-filtering, (4) dereplication, (5) attribute filtering, and (6) clustering and OTU tab-producer (using OBITools); (7) taxonomic assignment (using blastn); and (8) OTU table generation. Several types of eDNA samples can be processed (e.g. water, feces, soil).
It is Step 1 of the Metabarcoding Workflow within the Internal Joint Initiative (IJI).
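The order of the eight substeps can be sketched as a small Python table. This is only an illustration of the sequence described above and calls no real tool. The names `Substep`, `SUBSTEPS` and `describe` are hypothetical and not part of the service. Attributing OBITools to substeps 1.2 to 1.6 is an assumption based on how the description groups them.

```python
from typing import NamedTuple, Optional


class Substep(NamedTuple):
    number: str          # substep label, 1.1 to 1.8
    task: str            # what the substep does, per the description
    tool: Optional[str]  # tool named in the description, if any


# Hypothetical listing; the OBITools attribution for 1.2-1.6 is an assumption
# drawn from the grouping in the service description.
SUBSTEPS = [
    Substep("1.1", "sequencing error correction", "BayesHammer-SPAdes"),
    Substep("1.2", "pairwise alignment", "OBITools"),
    Substep("1.3", "pre-filtering", "OBITools"),
    Substep("1.4", "dereplication", "OBITools"),
    Substep("1.5", "attribute filtering", "OBITools"),
    Substep("1.6", "clustering and OTU tab-producer", "OBITools"),
    Substep("1.7", "taxonomic assignment", "blastn"),
    Substep("1.8", "OTU table generation", None),
]


def describe(substeps=SUBSTEPS):
    """Return one human-readable line per substep, in execution order."""
    lines = []
    for step in substeps:
        suffix = f" [{step.tool}]" if step.tool else ""
        lines.append(f"{step.number} {step.task}{suffix}")
    return lines


for line in describe():
    print(line)
```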
Identification
- Date ( Publication )
- 2021-03-23
- Date ( Creation )
- 2019-07-04
- Status
- Under development / Pre operational
- Version
- 1.0
- Keywords
- Bioinformatics
- Keywords
- NGS data
- Keywords
- Environmental DNA
- Keywords
- Metabarcoding pipeline
- Keywords
- Taxonomic assignment
- Keywords
- Metabarcoding
- Keywords
- IJI
- Access constraints
- License
- Use limitation
- The final license will be available soon.
- Protocol
- DOI
- Operation name
- Fastq
- Description
- This is one of the three input files required to run this service. The FASTQ file contains DNA sequences generated by high-throughput sequencing of eDNA samples.
- Function
- Input file
- Operation name
- Reference_db
- Description
- This is one of the three input files required for running this service. This file contains a list of species and reference genetic sequences that will be used for the taxonomic assignment of eDNA metabarcoding sequences.
- Function
- Input file
- Operation name
- Sample_metadata.tsv
- Description
- This is one of the three input files required for running this service. This file, in Darwin Core format (https://dwc.tdwg.org/terms/), contains eDNA sample collection information, including the location and geographical coordinates of the eDNA sampling sites.
- Function
- Input file
- Operation name
- Species_occ.csv
- Description
- This file is a taxa table containing the number of DNA sequences assigned to each taxon detected by high-throughput sequencing of environmental DNA samples.
- Function
- Output file
- Service Category
- data processing
- Service Category
- data analysis
- Service Category
- data selection
- Service Language
- eng
- Service TRL
- TRL 7 – System prototype demonstration in operational environment
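The Species_occ.csv output described above is a taxa table of sequence counts per taxon. The following minimal Python sketch shows how such a table could be aggregated and written. The column names (`taxon`, `sequence_count`), the input shape, and the example values are all assumptions made for illustration. The actual layout of the file produced by the service is not specified in this record.

```python
import csv
import io
from collections import Counter


def build_taxa_table(assignments):
    """Sum sequence counts per taxon.

    `assignments` is an iterable of (taxon, sequence_count) pairs, one per
    assigned cluster. This input shape is an assumption of the sketch.
    """
    totals = Counter()
    for taxon, sequence_count in assignments:
        totals[taxon] += sequence_count
    return totals


def to_csv(totals):
    """Render the totals as CSV text, sorted by taxon name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["taxon", "sequence_count"])  # hypothetical column names
    for taxon in sorted(totals):
        writer.writerow([taxon, totals[taxon]])
    return buffer.getvalue()


# Made-up example values, for illustration only.
example = [("Salmo trutta", 120), ("Esox lucius", 45), ("Salmo trutta", 30)]
table = build_taxa_table(example)
print(to_csv(table))
```

Sorting by taxon name keeps the output deterministic, which makes tables from different runs easier to compare.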