workflow
-
The aim of the (Taxonomic) Data Refinement Workflow is to provide a streamlined workflow environment for preparing observational and specimen data sets for use in scientific analysis on the Taverna platform. The workflow has been designed so that it:
- accepts input data in a recognized format, originating from various sources (e.g. services, local user data sets);
- includes a number of graphical user interfaces to view and interact with the data;
- keeps the output of each part of the workflow compatible with the input of every part, so that the user is free to choose a specific sequence of actions;
- allows for the use of custom-built as well as third-party applications and tools.
This workflow can be accessed through the BioVeL Portal: http://biovelportal.vliz.be/workflows?category_id=1
This workflow can be combined with the Ecological Niche Modelling Workflows: http://marine.lifewatch.eu/ecological-niche-modelling
Developed by: Biodiversity Virtual e-Laboratory (BioVeL) (EU FP7 project)
Technology or platform: The workflow has been developed to be run in the Taverna automated workflow environment.
-
This workflow aims to analyze diverse soil datasets using PCA to understand physicochemical properties. The process starts with converting SPSS (.sav) files into CSV format for better compatibility. It emphasizes variable selection, data quality improvement, standardization, and conducting PCA for data variance and pattern analysis. The workflow includes generating graphical representations like covariance and correlation matrices, scree plots, and scatter plots. These tools aid in identifying significant variables, exploring data structure, and determining optimal components for effective soil analysis.
Background
Understanding the intricate relationships and patterns within soil samples is crucial for various environmental and agricultural applications. Principal Component Analysis (PCA) serves as a powerful tool in unraveling the complexity of multivariate soil datasets. Soil datasets often consist of numerous variables representing diverse physicochemical properties, making PCA an invaluable method for:
∙ Dimensionality Reduction: Simplifying the analysis without compromising data integrity by reducing the dimensionality of large soil datasets.
∙ Identification of Dominant Patterns: Revealing dominant patterns or trends within the data, providing insights into key factors contributing to overall variability.
∙ Exploration of Variable Interactions: Enabling the exploration of complex interactions between different soil attributes, enhancing understanding of their relationships.
∙ Interpretability of Data Variance: Clarifying how much variance is explained by each principal component, aiding in discerning the significance of different components and variables.
∙ Visualization of Data Structure: Facilitating intuitive comprehension of data structure through plots such as scatter plots of principal components, helping identify clusters, trends, and outliers.
∙ Decision Support for Subsequent Analyses: Providing a foundation for subsequent analyses by guiding decision-making, whether in identifying influential variables, understanding data patterns, or selecting components for further modeling.
Introduction
The motivation behind this workflow is rooted in the imperative need to conduct a thorough analysis of a diverse soil dataset, characterized by an array of physicochemical variables. Comprising multiple rows, each representing distinct soil samples, the dataset encompasses variables such as percentage of coarse sands, percentage of organic matter, hydrophobicity, and others. The intricacies of this dataset demand a strategic approach to preprocessing, analysis, and visualization. This workflow centers around the exploration of soil sample variability through PCA, utilizing data formatted in SPSS (.sav) files. These files, specific to the Statistical Package for the Social Sciences (SPSS), are commonly used for data analysis. To lay the groundwork, the workflow begins with the transformation of an initial SPSS file into a CSV format, ensuring improved compatibility and ease of use throughout subsequent analyses. Incorporating PCA offers a sophisticated approach, enabling users to explore inherent patterns and structures within the data. The adaptability of PCA allows users to customize the analysis by specifying the number of components or desired variance. The workflow concludes with practical graphical representations, including covariance and correlation matrices, a scree plot, and a scatter plot, offering users valuable visual insights into the complexities of the soil dataset.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of diverse soil samples:
∙ Data transformation: Efficiently convert the initial SPSS file into a CSV format to enhance compatibility and ease of use.
∙ Standardization and target specification: Standardize the dataset and designate the target variable, ensuring consistency and preparing the data for subsequent PCA.
∙ PCA: Conduct PCA to explore patterns and variability within the soil dataset, facilitating a deeper understanding of the relationships between variables.
∙ Graphical representations: Generate graphical outputs, such as covariance and correlation matrices, aiding users in visually interpreting the complexities of the soil dataset.
Scientific questions
This workflow addresses critical scientific questions related to soil analysis:
∙ Variable importance: Identify variables contributing significantly to principal components through the covariance matrix and PCA.
∙ Data structure: Explore correlations between variables and gain insights from the correlation matrix.
∙ Optimal component number: Determine the optimal number of principal components using the scree plot for effective representation of data variance.
∙ Target-related patterns: Analyze how selected principal components correlate with the target variable in the scatter plot, revealing patterns based on target variable values.
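The standardization, correlation matrix, PCA, and scree-plot steps described in this entry can be sketched as follows. This is a minimal illustration with NumPy, not the workflow's actual implementation: the sample values, the column meanings, and the function names `standardize` and `pca` are hypothetical, and the SPSS-to-CSV conversion is left out (`pandas.read_spss` followed by `DataFrame.to_csv` could serve for that step, assuming the `pyreadstat` package is installed).

```python
import numpy as np

# Hypothetical soil samples: rows are samples, columns are variables
# (coarse sand %, organic matter %, hydrophobicity score).
X = np.array([
    [62.0, 1.8, 0.4],
    [55.0, 2.9, 1.1],
    [48.0, 4.1, 1.9],
    [70.0, 1.2, 0.2],
    [51.0, 3.5, 1.6],
    [66.0, 1.5, 0.5],
])


def standardize(data):
    """Center each column and scale it to unit sample variance."""
    return (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)


def pca(data, variance_target=0.95):
    """Return PC scores, explained-variance ratios, and the number of
    components needed to reach `variance_target` of the total variance."""
    z = standardize(data)
    # The covariance of standardized data is the correlation matrix.
    corr = np.cov(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratios = eigvals / eigvals.sum()             # heights of the scree plot
    n_keep = int(np.searchsorted(np.cumsum(ratios), variance_target)) + 1
    n_keep = min(n_keep, len(ratios))            # guard against rounding
    scores = z @ eigvecs                         # sample coordinates on the PCs
    return scores, ratios, n_keep


scores, ratios, n_keep = pca(X)
print("explained variance ratios:", np.round(ratios, 3))
print("components kept:", n_keep)
```

Plotting `ratios` against the component index gives the scree plot, and plotting the first two columns of `scores`, colored by the target variable, gives the scatter plot mentioned above.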
-
This workflow streamlines the export, preprocessing, and analysis of phytosociological inventories from a project database. The workflow's goals include exporting and preprocessing inventories, conducting statistical analyses, and using interactive graphs to visualize species dominance, altitudinal distribution, average coverage, similarity clusters, and species interactions. It also calculates and visualizes the fidelity index for species co-occurrence. This workflow addresses key scientific questions about dominant species, distribution patterns, species coverage, inventory similarity, species interactions, and co-occurrence probabilities, aiding efficient vegetation management in environmental projects.
Background
Efficient vegetation management in environmental projects necessitates a detailed analysis of phytosociological inventories. This workflow streamlines the export and preprocessing of vegetation inventories from the project database. Subsequently, it conducts various statistical analyses and graphical representations, offering a comprehensive view of plant composition and interactions.
Introduction
In the realm of vegetation research, the availability of phytosociological data is paramount. This workflow empowers users to specify parameters for exporting vegetation inventories, performs preprocessing, and conducts diverse statistical analyses. The resulting insights are visually represented through interactive graphs, highlighting predominant species, altitudinal ranges of plant communities, average species coverage, similarity clusters, and species interactions.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of phytosociological inventories:
1. Export and Preprocess Inventories: Enable the export and preprocessing of phytosociological inventories stored in the project database.
2. Statistical Analyses of Species and Plant Communities: Conduct detailed statistical analyses on the species and plant communities present in the inventories.
3. Interactive Graphical Representation: Utilize interactive graphs to represent predominant species, altitudinal ranges of plant communities, and average species coverage.
4. Similarity Dendrogram: Generate a dendrogram grouping similar phytosociological inventories based on the similarity of their species content.
5. Interactive Species Interaction Analysis: Visualize species interactions through interactive graphs, facilitating the identification of species that tend to coexist.
6. Calculation and Visualization of Fidelity Index: Calculate the fidelity index between species and visually represent the probability of two or more species co-occurring in the same inventory.
Scientific Questions
This workflow addresses critical scientific questions related to the analysis of phytosociological inventories:
- Dominant Species Identification: Which species emerge as predominant in the phytosociological inventories, and what is their frequency of occurrence?
- Altitudinal Distribution Patterns: How are plant communities distributed across altitudinal ranges, and are there discernible patterns?
- Average Species Coverage Assessment: What is the average coverage of plant species, and how does it vary across different inventories?
- Similarity in Inventory Content: How are phytosociological inventories grouped based on the similarity of their species content?
- Species Interaction Dynamics: Which species exhibit notable interactive dynamics, and how can these interactions be visualized?
- Fidelity Between Species: What is the likelihood that two or more species co-occur in the same inventory, and how does this fidelity vary across species pairs?
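The inventory similarity and species co-occurrence ideas in this entry can be illustrated with a short sketch. The source does not give the formula of its fidelity index, so the conditional co-occurrence proportion below is only an assumed stand-in; Jaccard similarity is likewise one possible choice of similarity measure, and the inventories and species labels are hypothetical.

```python
from itertools import combinations

# Hypothetical inventories: inventory id -> set of recorded species labels.
inventories = {
    "inv1": {"A", "B", "C"},
    "inv2": {"A", "B"},
    "inv3": {"B", "C"},
    "inv4": {"A", "D"},
}


def jaccard(first, second):
    """Jaccard similarity of two species sets (0 = disjoint, 1 = identical)."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def cooccurrence(inventories, species_a, species_b):
    """Share of the inventories containing species_a that also contain
    species_b. An assumed, simplified stand-in for a fidelity index."""
    with_a = [content for content in inventories.values() if species_a in content]
    if not with_a:
        return 0.0
    return sum(species_b in content for content in with_a) / len(with_a)


# Pairwise similarities: the kind of input a dendrogram is built from.
similarity = {
    (i, j): jaccard(inventories[i], inventories[j])
    for i, j in combinations(sorted(inventories), 2)
}

for pair, value in similarity.items():
    print(pair, round(value, 2))
print("P(B present | A present):", round(cooccurrence(inventories, "A", "B"), 2))
```

A hierarchical clustering routine applied to `1 - similarity` as a distance would then yield the dendrogram of similar inventories.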
-
Accurately mapping vegetation is crucial for environmental monitoring. Traditional methods for identifying shrubs are labor-intensive and impractical for large areas. This workflow uses remote sensing and deep learning to detect Juniperus shrubs from high-resolution RGB satellite images, making shrub identification more efficient and accessible to non-experts in machine learning.
Background
In a dynamic climate, accurately mapping vegetation distribution is essential for environmental monitoring, biodiversity conservation, forestry, and urban planning. One important application of vegetation mapping is the identification of individual shrubs. By shrub identification we mean detecting the location of each shrub and segmenting its morphology.
Introduction
Yet, shrub species monitoring is a challenging task. Ecologists have traditionally identified shrubs through classical field surveys. This is difficult because shrubs are often spread over large areas that are mostly inaccessible. As a result, field methods are labor-intensive, costly, time-consuming, unsustainable, and limited to small spatial and temporal scales, and their data are often not publicly available. Combining remote sensing and deep learning can help tackle these challenges and offers a great opportunity to improve plant surveying. First, remote sensing provides imagery at very high spatial resolution and great flexibility in data acquisition. These data can then be processed by deep learning models to identify shrubs automatically.
Aims
The objective of this workflow is to help scientists who are not experts in machine learning detect Juniperus shrubs from very-high-resolution RGB satellite images using deep learning and remote sensing tools.
Scientific Questions
Can we accurately detect high-mountain Juniperus shrubs from very-high-resolution RGB satellite images using deep learning?
-
The Ecological Niche Modelling Workflows offer an extensible framework for analyzing or predicting the impact of environmental changes on the distribution of biodiversity. Especially in combination with data aggregation workflows like the Taxonomic Data Refinement Workflow, the Ecological Niche Modelling Workflows facilitate the analysis of species distribution patterns over large geo-temporal, taxonomic, and environmental scales. Example applications include studies of species adaptations to climate change, dynamic modeling of ecologically related species, and identification of regions with accumulated risk of invasion, potential for restoration, or natural protected areas.
Developed by: The Biodiversity Virtual e-Laboratory (BioVeL) (EU FP7 project)
Technology or platform: These workflows have been developed to be run in the Taverna automated workflow environment (https://incubator.apache.org/projects/taverna.html). In their current form, the workflow files (with the .t2flow extension) can be loaded and executed in the workbench variant of Taverna. They have been tested with Taverna Workbench version 2.4. These workflows can also be run in the BioVeL Portal, a lightweight user interface that allows browsing, reviewing, and running Taverna workflows without installing any software.
-
Background
Biological invasions are acknowledged to be significant environmental and economic threats, yet the identification of key ecological traits determining invasiveness of species has remained elusive. One unappreciated source of variation concerns dietary flexibility of non-native species and their ability to shift trophic position within invaded food webs. Trophic plasticity may greatly influence invasion success as it facilitates colonisation, adaptation, and successful establishment of non-native species into new territories. In addition, having a flexible diet gives the introduced species a better chance to become invasive and, as a consequence, to have a strong impact on food webs, determining secondary disruptions such as trophic cascades and changes in energy fluxes. The deleterious effects can affect multiple trophic levels.
Introduction
Crustaceans are considered the most successful taxonomic group of aquatic invaders worldwide. Their ability to colonise and easily adapt to new ecosystems can be ascribed to a number of ecological features including their omnivorous feeding behaviour. This validation case study focuses on two invasive crustaceans widely distributed in marine and freshwater European waters: the Atlantic blue crab Callinectes sapidus and the Louisiana crayfish or red swamp crayfish Procambarus clarkii. Callinectes sapidus and Procambarus clarkii are opportunistic omnivores that feed on a variety of food sources from detritus to plants and invertebrates. For this reason, they represent a good model to investigate the variation of trophic niches in invaded food webs and their ecological impact on native communities. The ecological consequences of the invasion and establishment of these invasive crustaceans can vary from modification of carbon cycles in benthic food webs to regulation of prey/predator abundance through bottom-up and top-down interactions. Understanding how the trophic ecology of these invasive crustaceans shapes benthic food webs in invaded ecosystems is crucial for an accurate assessment of their impact. The analysis of stable isotopes can provide important clues on the trophic effects of invasive species within non-native ecosystems by evaluating changes in their trophic position and characteristics of their trophic niche.
Aims
This validation case uses a collection of stable isotopes (δ13C and δ15N) of C. sapidus and P. clarkii and their potential prey in invaded food webs to quantify changes in the trophic position of the invaders and to assess post-invasion shifts in their dietary habits. This case study additionally evaluates the main environmental drivers involved in trophic niche adaptations and whether such bioclimatic predictors influence broad-scale patterns of variation in the trophic position of the invader.
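As an illustration of how trophic position can be derived from δ15N values, the sketch below uses the widely used single-baseline estimator TP = λ + (δ15N_consumer − δ15N_baseline) / Δ15N. The source does not say which estimator this case study applies, so the formula choice, the default enrichment of 3.4‰ per trophic level, the baseline level λ = 2, and the example values are all assumptions.

```python
def trophic_position(d15n_consumer, d15n_baseline,
                     baseline_level=2.0, enrichment=3.4):
    """Single-baseline trophic position estimate.

    d15n_consumer  -- delta15N of the consumer (per mil)
    d15n_baseline  -- delta15N of the baseline organism (per mil)
    baseline_level -- trophic level of the baseline (2 for a primary consumer)
    enrichment     -- assumed delta15N enrichment per trophic level (per mil)
    """
    if enrichment <= 0:
        raise ValueError("enrichment must be positive")
    return baseline_level + (d15n_consumer - d15n_baseline) / enrichment


# Hypothetical values: a consumer enriched by 6.8 per mil over a
# primary-consumer baseline sits two levels above it.
tp = trophic_position(d15n_consumer=12.3, d15n_baseline=5.5)
print(round(tp, 2))
```

Comparing such estimates between populations or sampling dates is one way to express a shift in trophic position.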
-
This workflow aims to compare plant species across different natural spaces. The workflow involves downloading and filtering phytosociological inventories, preprocessing data, and unifying it for comparative analysis. The main outputs are a Venn diagram displaying shared and unique species, and a CSV table detailing common and uncommon species. The workflow addresses filter application effectiveness, Venn diagram clarity, species table accuracy, and overall efficiency in processing and visualization, supporting ecological studies of plant distribution.
Background
Comparative analysis of phytosociological inventories across different natural spaces is essential for understanding plant distribution. This workflow focuses on downloading inventories stored in the database, applying distinct filters for each natural space, and conducting a comparative analysis of shared and unique plant species. The primary output includes a Venn diagram representing species intersections and a CSV table detailing common and uncommon plant species across the selected natural spaces.
Introduction
In ecological studies, understanding the overlap and uniqueness of plant species across different natural spaces is crucial. This workflow employs phytosociological inventories stored in the database, downloading them separately for each natural space using specific filters. The workflow then conducts a comparative analysis, identifying shared and unique plant species. The visualization includes a Venn diagram for easy interpretation and a CSV table highlighting the common and uncommon species across the selected natural spaces.
Aims
The primary aim of this workflow is to facilitate the comparison of phytosociological inventories from different natural spaces, emphasizing shared and unique plant species. The workflow includes the following key components:
- Inventory Download and Preprocessing: Downloads phytosociological inventories from the database, applies specific filters for each natural space, and preprocesses the data to retain only the species present in each zone.
- Data Unification: Unifies the processed data into a single dataset, facilitating comparative analysis.
- Venn Diagram Representation: Generates a Venn diagram to visually represent the overlap and uniqueness of plant species across the selected natural spaces.
- Species Table Generation: Creates a CSV table showcasing common and uncommon plant species in the selected natural spaces.
Scientific Questions
- Filter Application Effectiveness: How effectively does the workflow apply distinct filters to download inventories for each natural space?
- Venn Diagram Interpretation: How intuitive and informative is the Venn diagram representation of shared and unique plant species across the selected natural spaces?
- Species Table Accuracy: How accurate is the CSV table in presenting common and uncommon plant species in the comparative analysis?
- Workflow Efficiency: How efficiently does the workflow streamline the entire process, from data download to visualization, for comparative phytosociological analysis?
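The comparison behind the Venn diagram and the species table comes down to set membership, which the sketch below illustrates. It is not the workflow's code: the natural-space names, species labels, CSV column names, and the functions `venn_regions` and `species_table` are hypothetical.

```python
import csv
import io

# Hypothetical species sets per natural space, after filtering and preprocessing.
species_by_space = {
    "Space A": {"sp1", "sp2", "sp3"},
    "Space B": {"sp2", "sp3", "sp4"},
    "Space C": {"sp3", "sp5"},
}


def venn_regions(groups):
    """Map each species to the sorted tuple of spaces where it occurs.
    Each distinct tuple corresponds to one region of the Venn diagram."""
    membership = {}
    for space, species in groups.items():
        for name in species:
            membership.setdefault(name, []).append(space)
    return {name: tuple(sorted(spaces)) for name, spaces in membership.items()}


def species_table(groups):
    """Return CSV text with one row per species: its name, the spaces it
    occurs in, and whether it is common to all spaces."""
    regions = venn_regions(groups)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["species", "spaces", "common_to_all"])
    for name in sorted(regions):
        spaces = regions[name]
        writer.writerow([name, ";".join(spaces), len(spaces) == len(groups)])
    return buffer.getvalue()


regions = venn_regions(species_by_space)
table = species_table(species_by_space)
print(table)
```

Counting how many species fall into each tuple in `regions` gives the numbers that a Venn diagram library would display in each region.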
-
This workflow aims to efficiently integrate floral sample data from Excel files into a MongoDB database for botanical projects. It involves verifying and updating taxonomic information, importing georeferenced floral samples, converting data to JSON format, and uploading it to the database. This process ensures accurate taxonomy and enriches the database with comprehensive sample information, supporting robust data analysis and enhancing the project's overall dataset.
Background
Efficient management of flora sample data is essential in botanical projects, especially when integrating diverse information into a MongoDB database. This workflow addresses the challenge of incorporating floral samples, collected at various sampling points, into the MongoDB database. The database is divided into two segments: one storing taxonomic information and common characteristics of taxa, and the other containing georeferenced floral samples with relevant information. The workflow ensures that, upon importing new samples, taxonomic information is verified and updated, if necessary, before storing the sample data.
Introduction
In botanical projects, effective data handling is pivotal, particularly when incorporating diverse flora samples into a MongoDB database. This workflow focuses on importing floral samples from an Excel file into MongoDB, ensuring data integrity and taxonomic accuracy. The database is structured into taxonomic information and a collection of georeferenced floral samples, each with essential details about the collection location and the species' nativity. The workflow dynamically updates taxonomic records and stores new samples in the appropriate database sections, enriching the overall floral sample collection.
Aims
The primary aim of this workflow is to streamline the integration of floral sample data into the MongoDB database, maintaining taxonomic accuracy and enhancing the overall collection. The workflow includes the following key components:
- Taxonomy Verification and Update: Checks and updates taxonomic information in the MongoDB database, ensuring accuracy before importing new floral samples.
- Georeferenced Sample Import: Imports floral samples from the Excel file, containing georeferenced information and additional sample details.
- JSON Transformation and Database Upload: Transforms the floral sample information from the Excel file into JSON format and uploads it to the appropriate sections of the MongoDB database.
Scientific Questions
- Taxonomy Verification Process: How effectively does the workflow verify and update taxonomic information before importing new floral samples?
- Georeferenced Sample Storage: How does the workflow handle the storage of georeferenced floral samples, considering collection location and species nativity?
- JSON Transformation Accuracy: How successful is the transformation of floral sample information from the Excel file into JSON format for MongoDB integration?
- Database Enrichment: How does the workflow contribute to enriching the taxonomic and sample collections in the MongoDB database, and how is this reflected in the overall project dataset?
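The verify-then-import sequence described in this entry can be sketched without a live database. In the illustration below, plain Python containers stand in for the two database segments, and rows that would come from the Excel sheet are written inline; the field names, sample values, and the functions `upsert_taxon` and `to_sample_document` are hypothetical, not the workflow's schema.

```python
import json

# Hypothetical in-memory stand-ins for the two database segments.
taxonomy = {"Quercus ilex": {"family": "Fagaceae", "native": True}}
samples = []

# Hypothetical rows, as they might be read from the Excel sheet.
rows = [
    {"taxon": "Quercus ilex", "family": "Fagaceae", "native": True,
     "lat": 37.1, "lon": -3.4},
    {"taxon": "Ailanthus altissima", "family": "Simaroubaceae", "native": False,
     "lat": 37.2, "lon": -3.5},
]


def upsert_taxon(taxonomy, row):
    """Insert the taxon if it is missing, or update it if its fields differ.
    Returns True when the taxonomy segment was modified."""
    record = {"family": row["family"], "native": row["native"]}
    if taxonomy.get(row["taxon"]) == record:
        return False
    taxonomy[row["taxon"]] = record
    return True


def to_sample_document(row):
    """Convert one row into a JSON-serializable sample document whose
    location is a GeoJSON point (longitude first, then latitude)."""
    return {
        "taxon": row["taxon"],
        "location": {"type": "Point", "coordinates": [row["lon"], row["lat"]]},
    }


for row in rows:
    upsert_taxon(taxonomy, row)              # taxonomy is checked first
    samples.append(to_sample_document(row))  # then the sample is stored

payload = json.dumps(samples, indent=2)
print(payload)
```

With a real database, the dictionary update would typically become a driver call such as pymongo's `update_one(..., upsert=True)` on the taxonomy collection, and the list append an `insert_many` on the samples collection.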
-
Background
Monitoring hard-bottom marine biodiversity can be challenging as it often involves non-standardised sampling methods that limit scalability and inter-comparison across different monitoring approaches. Therefore, it is essential to implement standardised techniques when assessing the status of and changes in marine communities, in order to give the correct information to support management policy and decisions, and to ensure the most appropriate level of protection for the biodiversity in each ecosystem. Biomonitoring methods need to comply with a number of criteria including the implementation of broadly accepted standards and protocols and the collection of FAIR data (Findable, Accessible, Interoperable, and Reusable).
Introduction
Artificial substrates represent a promising tool for monitoring community assemblages of hard-bottom habitats with a standardised methodology. The European ARMS project is a long-term observatory network in which about 20 institutions distributed across 14 European countries, including Greenland and Antarctica, collaborate. The network consists of Autonomous Reef Monitoring Structures (ARMS) which are deployed in the proximity of marine stations and Long-term Ecological Research sites. ARMS units are passive monitoring systems made of stacked settlement plates that are placed on the sea floor. The three-dimensional structure of the settlement units mimics the complexity of marine substrates and attracts sessile and motile benthic organisms. After a certain period of time these structures are brought up, and visual, photographic, and genetic (DNA metabarcoding) assessments are made of the lifeforms that have colonised them. These data are used to systematically assess the status of, and changes in, the hard-bottom communities of near-coast ecosystems.
Aims
ARMS data are quality controlled and open access, and they are permanently stored (Marine Data Archive) along with their metadata (IMIS, catalogue of VLIZ), ensuring that the data are FAIR. Data from ARMS observatories provide a promising early-warning system for marine biological invasions by: i) identifying newly arrived Non-Indigenous Species (NIS) at each ARMS site; ii) tracking the migration of already known NIS in European continental waters; iii) monitoring the composition of hard-bottom communities over longer periods; and iv) identifying the Essential Biodiversity Variables (EBVs) for hard-bottom fauna, including NIS. The ARMS validation case was conceived to achieve these objectives: a data-analysis workflow was developed to process raw genetic data from ARMS; end-users can select ARMS samples from the ever-growing number available in the collection; and raw DNA sequences are analysed using a bioinformatic pipeline (P.E.M.A.) embedded in the workflow for taxonomic identification. In the data-analysis workflow, taxa at each location are identified with reference to WoRMS and WRiMS, web services used to check, respectively, the identity of the organisms and whether they are introduced.
-
The workflow "Pollen Trends Analysis with AeRobiology" leverages the AeRobiology library to manage and analyze time-series data of airborne pollen particles. Aimed at understanding the temporal dynamics of different pollen types, this workflow ensures data quality, profiles seasonal trends, and explores temporal variations. It integrates advanced features for analyzing pollen concentrations and their correlation with meteorological variables, offering comprehensive insights into pollen behavior over time. The workflow enhances data accessibility, facilitating broader research and public health applications.
Background
In the dynamic landscape of environmental research and public health, the AeRobiology library (https://cran.r-project.org/web/packages/AeRobiology/index.html) emerges as a potent instrument tailored for managing diverse airborne particle data. As the prevalence of airborne pollen-related challenges intensifies, understanding the nuanced temporal trends in different pollen types becomes imperative. AeRobiology not only addresses data quality concerns but also offers specialized tools for unraveling intricate insights into the temporal dynamics of various pollen types.
Introduction
Amidst the complexities of environmental research, particularly in the context of health studies, the meticulous analysis of airborne particles, specifically various pollen types, takes center stage. This workflow, harnessing the capabilities of AeRobiology, adopts a holistic approach to process and analyze time-series data. Focused on deciphering the temporal nuances of pollen seasons, this workflow aims to significantly contribute to our understanding of the temporal dynamics of different airborne particle types.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of time-series pollen samples:
- Holistic Data Quality Assurance: Conduct a detailed examination of time-series data for various pollen types, ensuring completeness and accuracy to establish a robust foundation for subsequent analysis.
- Pollen-Specific Seasonal Profiling: Leverage AeRobiology's advanced features to calculate and visually represent key parameters of the seasonal trends for different pollen types, offering a comprehensive profile of their temporal dynamics.
- Temporal Dynamics Exploration: Investigate the temporal trends in concentrations of various pollen types, providing valuable insights into their evolving nature over time.
- Enhanced Accessibility: Employ AeRobiology's interactive tools to democratize the exploration of time-series data, making complex information accessible to a broader audience of researchers and professionals.
Scientific Questions
This workflow addresses critical scientific questions related to pollen analysis:
- Distinct Temporal Signatures: What are the discernible patterns and trends in the temporal dynamics of different airborne pollen types, especially during peak seasons?
- Pollen-Specific Abundance Variability: How does the abundance of various pollen types vary throughout their respective seasons, and what environmental factors contribute to these fluctuations?
- Meteorological Correlations: Are there statistically significant correlations between the concentrations of different pollen types and specific meteorological variables, elucidating the influencing factors unique to each type?
- Cross-Annual Comparative Analysis: Through the lens of AeRobiology, how do the temporal trends of different pollen types compare across different years, and what contextual factors might explain observed variations?
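To make "key parameters of the seasonal trends" concrete, the sketch below computes the start and end of a main pollen season with a percentage-based definition: the season is the period covering the central share of the annual pollen total. AeRobiology itself is an R package, so this Python function is not its API or code; the function name `pollen_season`, the `perc` parameter, and the daily counts are hypothetical and only illustrate the idea.

```python
def pollen_season(daily_counts, perc=95.0):
    """Return (start_index, end_index) of the main pollen season.

    The season covers the central `perc` percent of the annual total:
    it starts on the first day the cumulative sum reaches the lower tail
    and ends on the first day it reaches the upper tail.
    """
    if not 0.0 < perc < 100.0:
        raise ValueError("perc must be between 0 and 100")
    total = sum(daily_counts)
    if total <= 0:
        raise ValueError("no pollen recorded")
    tail = (100.0 - perc) / 200.0               # fraction cut from each side
    lower, upper = tail * total, (1.0 - tail) * total
    cumulative, start, end = 0.0, None, None
    for day, count in enumerate(daily_counts):
        cumulative += count
        if start is None and cumulative >= lower:
            start = day
        if cumulative >= upper:
            end = day
            break
    return start, end


# Hypothetical daily concentrations for one pollen type (index = day).
counts = [0, 2, 8, 20, 40, 20, 8, 2, 0]
start, end = pollen_season(counts, perc=90)
print("season from day", start, "to day", end, "peak on day", counts.index(max(counts)))
```

Running the same function on each year of a multi-year series gives start dates, end dates, and season lengths that can be compared across years or related to meteorological variables.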