Metabarcoding
-
Background
Freshwater ecosystems have been profoundly affected by habitat loss, degradation, and overexploitation, leaving them especially vulnerable to biological invasions. Whether non-indigenous species (NIS) are key drivers or merely complementary factors of biodiversity loss is still debated within the scientific community. However, biological invasions, together with other anthropogenic stressors, are causing population declines and the homogenisation of biodiversity in freshwater ecosystems worldwide. For example, river basins with greater numbers of NIS have been shown to have higher extinction rates of native fish species. Consequently, effective biomonitoring approaches that support the protection actions of managers, stakeholders and policy-makers are now essential.

Introduction
Conventional methods of monitoring freshwater fish diversity rely on direct observation of organisms. They are therefore costly and labour- and resource-intensive, require taxonomic expertise, and can be invasive. Retrieving DNA from environmental samples to obtain information about species and communities can overcome some of these difficulties. DNA recovered in this way is known as environmental DNA (eDNA). It can be isolated from water, soil, air or faeces, because organisms shed genetic material into their surroundings through metabolic waste, damaged tissues, sloughed skin cells and decomposition. The analysis of eDNA consists of extracting the genetic material and subjecting it to a Polymerase Chain Reaction (PCR), which amplifies the target DNA. High-throughput sequencing (HTS) then allows the simultaneous identification of many species within a given taxonomic group. This community-wide approach is known as eDNA metabarcoding and relies on broad-range primers that amplify DNA from a whole set of species during PCR.
In recent years, the cost of this technology has decreased drastically, making it very attractive for conservation management and scientific research. A number of studies have demonstrated that eDNA metabarcoding is more sensitive than conventional biomonitoring methods for freshwater fish, as it can detect rare or low-abundance taxa. As a result, it can be used as an early-warning tool to detect new NIS at the initial stages of colonisation, when they are not yet abundant in the ecosystem.

Aims
This validation case concerns eDNA metabarcoding fish sequences collected from the Douro Basin in Portugal. The DNA sequences are processed by a bioinformatic pipeline, wrapped in the first part of the analytical workflow, which performs a quality check and assigns the sequences taxonomically to produce a list of taxa. The workflow can process DNA sequences of different kinds, depending on the genetic markers used, and can therefore be applied to different taxonomic groups and ecosystems. The taxa identified may include indigenous organisms as well as taxa newly recorded in a given geographical region. For that reason, the national checklists of the Global Register of Introduced and Invasive Species (GRIIS), available through GBIF, are consulted to check whether the organisms detected are recognised as NIS or whether previously unrecorded NIS have been detected through eDNA metabarcoding analysis.
-
This is one of the three input files required for running Step 1 (Metabarcoding Runner) of the Metabarcoding workflow. This file, in Darwin Core format (https://dwc.tdwg.org/terms/), contains eDNA sample collection information, including the location and geographical coordinates of the eDNA sampling sites. The other two input files for Step 1 are the "Fastq" file, with sequences generated from High Throughput Sequencing of eDNA samples, and the "Reference_db" file, with reference sequences for the taxonomic assignment of eDNA sequences.
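As an illustration of how such a file might be read, here is a minimal Python sketch. It assumes a comma-separated file whose columns use the Darwin Core terms eventID, locality, decimalLatitude and decimalLongitude; the actual columns of the Sample_metadata file are not specified here, and the function name and the example row are hypothetical placeholders.

```python
import csv
import io


def read_sampling_sites(handle):
    """Read sampling sites from a CSV handle that uses Darwin Core column names."""
    sites = []
    for row in csv.DictReader(handle):
        sites.append({
            "eventID": row["eventID"],
            "locality": row["locality"],
            # Darwin Core stores coordinates as decimal degrees.
            "lat": float(row["decimalLatitude"]),
            "lon": float(row["decimalLongitude"]),
        })
    return sites


# Placeholder content, not real sampling data.
example = io.StringIO(
    "eventID,locality,decimalLatitude,decimalLongitude\n"
    "site-01,placeholder locality,41.0,-8.0\n"
)
sites = read_sampling_sites(example)
print(sites)
```

Converting the coordinates to floats at read time means that a malformed value fails early, before any spatial step uses it.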
-
This is one of the three input files required for running Step 1 (Metabarcoding Runner) of the Metabarcoding workflow. This file contains a list of species and reference genetic sequences that will be used for the taxonomic assignment of eDNA metabarcoding sequences. The other two input files for Step 1 are the "Fastq" file, with sequences generated from High Throughput Sequencing of eDNA samples, and the "Sample_metadata" file, with eDNA sample information.
-
This service represents Step 2 of the Metabarcoding Workflow within the Internal Joint Initiative. It converts CSV data into rdata files: it takes as input the Species_occ.csv file (the output of Step 1, Metabarcoding Runner), checks which checklists are available for each country and retrieves the first one. It produces two rdata files, which are the inputs for Step 3 (GBIF NIS Verifier) of the Metabarcoding workflow.
-
This service computes a list of taxa IDs detected from metabarcoding sequences of environmental DNA (eDNA) samples. It comprises eight substeps (1.1 to 1.8) implemented as a single step: (1) sequencing error correction (using BayesHammer-SPAdes); (2) pairwise alignment; (3) pre-filtering; (4) dereplication; (5) attribute filtering; (6) clustering and production of the operational taxonomic unit (OTU) table (using OBITools); (7) taxonomic assignment (using blastn); (8) generation of the final OTU table. Several types of eDNA samples can be processed (e.g. water, faeces, soil). It represents Step 1 of the Metabarcoding Workflow within the Internal Joint Initiative.
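To make the dereplication and filtering substeps more concrete, the following Python sketch collapses identical reads into unique sequences with counts and then drops sequences seen fewer than a given number of times. It is a simplified illustration only, not the OBITools implementation used by the service; the function name, the min_count parameter and the example reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads, min_count=1):
    """Collapse identical reads and keep those seen at least min_count times.

    Returns (sequence, count) pairs, most abundant first.
    """
    # Upper-case the reads so that case differences do not split a sequence.
    counts = Counter(read.upper() for read in reads)
    return [(seq, n) for seq, n in counts.most_common() if n >= min_count]


# Made-up reads for illustration.
reads = ["ACGTAC", "acgtac", "ACGTAC", "TTGACA"]
unique_reads = dereplicate(reads)
abundant_reads = dereplicate(reads, min_count=2)
print(unique_reads)
print(abundant_reads)
```

Keeping the count alongside each unique sequence is what allows later steps to filter on abundance and to report read numbers per OTU.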
-
This service extracts from GBIF (Global Biodiversity Information Facility) the occurrence records of taxa that were not classified as NIS in Step 3 (GBIF NIS Verifier) of the Metabarcoding workflow. It aims to verify, using GBIF occurrence records, whether eDNA detections fall outside the known distribution range of a taxon. For such detections, it also produces a geographical spatial polygon based on the occurrence records available in GBIF. It represents Step 4 of the Metabarcoding Workflow within the Internal Joint Initiative.
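How the polygon is derived from the occurrence records is not described here. One simple possibility, shown purely as an assumption, is the convex hull of the occurrence coordinates. The Python sketch below uses Andrew's monotone chain algorithm and treats longitude and latitude as planar coordinates, which is a simplification; the function names and the example points are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull of (lon, lat) points, returned in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Pop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


# Made-up occurrence coordinates for illustration.
occurrences = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
hull = convex_hull(occurrences)
print(hull)
```

A convex hull tends to overestimate a range that is fragmented or concave, so a production service may well use a different construction.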
-
This service intersects the maps (geographical spatial polygons) built from GBIF occurrences with the locations of taxa that were detected by eDNA metabarcoding but not included in the NIS checklist. The intersection verifies whether each such eDNA detection is likely to be a new NIS at that location. It represents Step 5 of the Metabarcoding Workflow within the Internal Joint Initiative.
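The intersection can be pictured as a point-in-polygon test: a detection site that falls outside the polygon of known occurrences is a candidate new NIS record. The Python sketch below uses the standard ray-casting test on planar coordinates. It is an illustration under stated assumptions, not the service's code: the geometry library actually used is not specified, and the function names, the polygon and the sites are hypothetical.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True when (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is joined
    back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def sites_outside_range(sites, polygon):
    """Return the detection sites that fall outside the known-range polygon."""
    return [s for s in sites if not point_in_polygon(s["lon"], s["lat"], polygon)]


# Made-up polygon and detection sites for illustration.
known_range = [(0, 0), (10, 0), (10, 10), (0, 10)]
sites = [
    {"id": "site-A", "lon": 5, "lat": 5},
    {"id": "site-B", "lon": 15, "lat": 5},
]
candidates = sites_outside_range(sites, known_range)
print([s["id"] for s in candidates])
```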
-
This service checks which taxa detected and identified from eDNA metabarcoding sequences are listed as NIS and which are not (i.e. native or unrecorded NIS). It uses the GBIF (Global Biodiversity Information Facility) records of NIS available for each country, provided by the Invasive Species Specialist Group (ISSG). It checks whether each species is present in the checklist and, if so, flags the species as invasive for that country: the isInChecklist column of the data frame is set to 1 (yes) or 0 (no), and the checklist key, or a note, is written to the ref_checklistKey column for the corresponding cases. It represents Step 3 of the Metabarcoding Workflow within the Internal Joint Initiative.
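The flagging logic can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: only the column names isInChecklist and ref_checklistKey come from the description above, while the function name, the record layout, the note text, the checklist key and the species names are hypothetical placeholders.

```python
def flag_against_checklist(detections, checklist_species, checklist_key):
    """Return copies of the detection records with the two checklist columns added."""
    # Normalise names so that case and stray spaces do not prevent a match.
    known = {name.strip().lower() for name in checklist_species}
    flagged = []
    for record in detections:
        listed = record["species"].strip().lower() in known
        row = dict(record)
        row["isInChecklist"] = 1 if listed else 0
        # Store the checklist key for listed species, otherwise a short note.
        row["ref_checklistKey"] = checklist_key if listed else "not in checklist"
        flagged.append(row)
    return flagged


# Placeholder names, not real detections or a real checklist.
detections = [
    {"species": "Genus alpha", "country": "PT"},
    {"species": "Genus beta", "country": "PT"},
]
flagged = flag_against_checklist(detections, ["genus alpha"], "example-key")
for row in flagged:
    print(row["species"], row["isInChecklist"], row["ref_checklistKey"])
```

Taxa that end up with isInChecklist equal to 0 are the ones a later step would compare against the known distribution range.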
-
This is one of the three input files required for running Step 1 (Metabarcoding Runner) of the Metabarcoding workflow. The fastq file contains DNA sequences generated from High Throughput Sequencing of eDNA samples. The other two input files for Step 1 are the "Reference_db" and the "Sample_metadata" files, which contain, respectively, the reference genetic sequences for the taxonomic assignment and the eDNA sample information.
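For readers unfamiliar with the format, the Python sketch below reads FASTQ records and computes the mean quality of a read. It assumes plain four-line records (header, sequence, "+" separator, quality string) with Phred+33 quality encoding, and it stops at the first empty line; the function names and the example record are hypothetical.

```python
import io


def read_fastq(handle):
    """Yield (read_id, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or an empty line)
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().rstrip("\n")
        # Drop the leading '@' from the header.
        yield header[1:], sequence, quality


def mean_phred(quality, offset=33):
    """Mean Phred score of a quality string (Phred+33 assumed by default)."""
    return sum(ord(char) - offset for char in quality) / len(quality)


# Made-up record for illustration.
example = io.StringIO("@read1\nACGT\n+\nIIII\n")
records = list(read_fastq(example))
print(records)
print(mean_phred(records[0][2]))
```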
-
PEMA is an HPC-centered, containerized assembly of key metabarcoding analysis tools. It supports the downstream analysis of four marker genes (16S/18S rRNA, ITS and COI) and, because users can train the classifiers with custom reference databases, it can also be applied to further marker genes. By combining state-of-the-art technologies and algorithms with a framework that is easy to get, set up and use, PEMA allows researchers to tune each study thoroughly, thanks to roll-back checkpoints and on-demand partial pipeline execution.