-
This workflow aims to streamline the integration of phytosociological inventory data stored in multiple XML files within a ZIP archive into a MongoDB database. This process is crucial for effective data management within the project's Virtual Research Environment (VRE). The workflow involves extracting XML files from the ZIP archive, converting them to JSON format for MongoDB compatibility, checking for duplicates to ensure data integrity, and uploading the data to increment the inventory count. This enhances the robustness and reliability of the inventory dataset for comprehensive analysis.

Background

Efficient data management is crucial in phytosociological inventories, necessitating seamless integration of inventory data. This workflow addresses a key aspect by facilitating the importation of phytosociological inventories stored in multiple XML files within a ZIP archive into the MongoDB database. This integration is vital for the project's Virtual Research Environment (VRE), providing a foundation for robust data analysis. The workflow comprises two essential components: converting XML to JSON and checking for inventory duplicates, ultimately enhancing the integrity and expansiveness of the inventory database.

Introduction

In phytosociological inventories, effective data handling is paramount, particularly concerning the integration of inventory data. This workflow focuses on the pivotal task of importing phytosociological inventories, stored in multiple XML files within a ZIP archive, into the MongoDB database. This process is integral to the VRE of the project, laying the groundwork for comprehensive data analysis. The workflow's primary goal is to ensure a smooth and duplicate-free integration, promoting a reliable dataset for further exploration and utilization within the project's VRE.

Aims

The primary aim of this workflow is to streamline the integration of phytosociological inventory data, stored in multiple XML files within a ZIP archive, into the MongoDB database. This ensures a robust and duplicate-free dataset for further analysis within the project's VRE. To achieve this, the workflow includes the following key components (a minimal code sketch follows this entry):

- ZIP Extraction and XML to JSON Conversion: Extracts XML files from the ZIP archive and converts each phytosociological inventory stored in XML format to JSON, preparing the data for MongoDB compatibility.
- Duplicate Check and Database Upload: Checks for duplicate inventories in the MongoDB database and uploads the JSON files, incrementing the inventory count in the database.

Scientific Questions

- ZIP Archive Handling: How effectively does the workflow handle ZIP archives containing multiple XML files with distinct phytosociological inventories?
- Data Format Compatibility: How successful is the conversion of XML-based phytosociological inventories to the JSON format for MongoDB integration?
- Database Integrity Check: How effective is the duplicate check component in ensuring data integrity by identifying and handling duplicate inventories?
- Inventory Count Increment: How does the workflow contribute to the increment of the inventory count in the MongoDB database, and how is this reflected in the overall project dataset?
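A minimal sketch of the extract-convert-check-upload sequence described in the Aims, assuming a Python implementation with xmltodict and pymongo; the connection string, collection name, and the `inventory_id` duplicate key are illustrative assumptions, not the workflow's actual configuration.

```python
import zipfile

import xmltodict               # converts XML documents into Python dicts
from pymongo import MongoClient

# Assumed connection details and duplicate key -- adjust to the real VRE setup.
client = MongoClient("mongodb://localhost:27017")
collection = client["phytosociology"]["inventories"]

def import_zip(zip_path: str) -> int:
    """Extract every XML inventory in the archive, convert it to a JSON-like
    document, skip duplicates, and insert the rest. Returns the number added."""
    added = 0
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue
            with archive.open(name) as fh:
                record = xmltodict.parse(fh.read())
            # Duplicate check on an assumed identifier (root "id" attribute or file name).
            root = next(iter(record.values()))
            inv_id = root.get("id", name) if isinstance(root, dict) else name
            if collection.count_documents({"inventory_id": inv_id}, limit=1):
                continue
            record["inventory_id"] = inv_id
            collection.insert_one(record)
            added += 1
    return added

if __name__ == "__main__":
    print(f"Inventories added: {import_zip('inventories.zip')}")
```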
-
This workflow employs a deep learning model for blind spectral unmixing, avoiding the need for expensive hyperspectral data. The model processes 224×224-pixel RGB images and associated environmental data to generate CSV files detailing LULC abundance at two levels of detail (N1 and N2). The aim is to provide an efficient tool for LULC monitoring, answering the question: can LULC abundance be estimated from RGB images and environmental data? This framework supports environmental monitoring and land cover analysis.

Background

Land Use and Land Cover (LULC) represents earth surface biophysical properties of natural or human origin, such as forests, water bodies, agricultural fields, or urban areas. Often, different LULC types are mixed together in the same analyzed area. Nowadays, spectral imaging sensors allow us to capture these mixed LULC types (i.e., endmembers) together as different spectral data signals. LULC type identification within a spectral mixture (i.e., endmember identification) and their quantitative abundance assessment (i.e., endmember abundance estimation) play a key role in understanding earth surface transformations and climate change effects. These two tasks are carried out through spectral unmixing algorithms, by which the measured spectrum of a mixed image is decomposed into a collection of constituents (i.e., spectra, or endmembers) and a set of fractions indicating their abundances.

Introduction

Early research on spectral unmixing dates back more than three decades. First attempts, referred to as linear unmixing, assumed that the spectral response recorded for an LULC mixture is simply an additive function of the spectral response of each class weighted by its proportional coverage. Notably, some authors used linear regression and similar linear mixture-based techniques to relate the spectral response to its class composition. Afterwards, other authors claimed the necessity of overcoming this assumption by proposing non-linear unmixing methods. However, non-linear methods require endmember spectra extraction for each LULC class, which has been found difficult in several works. Moreover, some studies indicated that it is unlikely that the spectra could be derived directly from the remotely sensed data, since the majority of image pixels may be mixed. To overcome these limitations, several works introduced what is called blind spectral unmixing as an alternative method that avoids the need to derive any endmember spectra or to make any prior assumption about their mixing nature. However, the majority of works that adopted blind spectral unmixing used deep learning-based models trained with expensive and hard-to-process hyperspectral or multispectral images. Therefore, many researchers during the last decade pointed out that more effort should be dedicated to the usage of more affordable remote sensing data with few bands in spectral unmixing. They justified this need by two important factors: (1) in real situations, we might have access to images with only a few bands because of their availability, cost-effectiveness, and acquisition time-efficiency in comparison to imagery gathered with multi-band devices that require more processing effort and expense; (2) in some cases, we do not really need a huge number of bands, as they can be used as a fundamental dataset from which we determine optimal wavebands for a particular application.

In parallel, high-quality research in artificial intelligence applied to remote sensing imagery, such as computer vision-based techniques and especially DL, is continuously achieving new breakthroughs that encourage researchers to entrust remote sensing imagery analysis tasks to these models and be confident about their performance.

Aims

The objective of this work is to present what is, to our knowledge, the first study that explores a multi-task deep learning approach for blind spectral unmixing using only 224×224-pixel RGB images derived from Sentinel-2 and enriched with their corresponding environmental ancillary data (topographic and climatic data), without the need to use any expensive and complex hyperspectral or multispectral data. The deep learning model used in this study is trained with a multi-task learning (MTL) approach, which is well suited to combining information from different tasks to improve the performance of the model on each specific task, motivated by the idea that different tasks can share common feature representations. Thus, the model provided in this workflow was optimized for the endmember abundance estimation task, which quantifies the spatial percentage covered by each LULC type within the analyzed RGB image, while also being trained on other spectral-unmixing-related tasks that improve its accuracy on this main target task. For each input (RGB image + ancillary data), the model outputs the abundance values of the endmembers contained in the covered area, summarized in an output CSV file. The results can be computed at two different levels, N1 and N2. These two levels reflect the two land use/cover level definitions of the SIPNA land use/cover mapping campaign (Sistema de Información sobre el Patrimonio Natural de Andalucía), which aims to build an information system on the natural heritage of Andalusia in Spain (https://www.juntadeandalucia.es/medioambiente/portal/landing-page-%C3%ADndice/-/asset_publisher/zX2ouZa4r1Rf/content/sistema-de-informaci-c3-b3n-sobre-el-patrimonio-natural-de-andaluc-c3-ada-sipna-/20151). The first level, "N1", contains four high-level LULC classes, whereas the second level, "N2", contains ten finer-level LULC classes. The model was mainly trained and validated on the region of Andalusia in Spain.

Scientific Questions

Through the development of this workflow, we aim to address the following main scientific question:
- Can we estimate the abundance of each land use/land cover type inside an RGB satellite image using only the RGB image and the environmental ancillary data corresponding to the area covered by this image?
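As an illustration of the multi-task setup described above (not the actual trained model released with this workflow), the sketch below wires a shared image encoder and an ancillary-data branch into two heads that predict abundance vectors for the N1 (four classes) and N2 (ten classes) levels; the ResNet-18 backbone, ancillary dimension, and fusion layer sizes are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class AbundanceNet(nn.Module):
    """Shared image encoder + ancillary fusion, with one head per LULC level."""
    def __init__(self, n_ancillary: int = 8, n1_classes: int = 4, n2_classes: int = 10):
        super().__init__()
        backbone = resnet18(weights=None)          # assumption: ResNet-18 encoder
        backbone.fc = nn.Identity()                # keep the 512-d feature vector
        self.encoder = backbone
        self.fuse = nn.Sequential(nn.Linear(512 + n_ancillary, 256), nn.ReLU())
        self.head_n1 = nn.Linear(256, n1_classes)  # coarse-level abundances
        self.head_n2 = nn.Linear(256, n2_classes)  # fine-level abundances

    def forward(self, image, ancillary):
        feats = self.encoder(image)                          # (B, 512)
        fused = self.fuse(torch.cat([feats, ancillary], 1))  # (B, 256)
        # Softmax so predicted abundances are non-negative and sum to 1.
        return self.head_n1(fused).softmax(1), self.head_n2(fused).softmax(1)

# Toy forward pass with random data, mirroring the expected input shapes.
model = AbundanceNet()
rgb = torch.rand(2, 3, 224, 224)       # batch of RGB tiles
aux = torch.rand(2, 8)                 # batch of ancillary (topographic/climatic) features
abund_n1, abund_n2 = model(rgb, aux)
print(abund_n1.shape, abund_n2.shape)  # torch.Size([2, 4]) torch.Size([2, 10])
```

Training each head against its own abundance targets (plus any auxiliary unmixing tasks) is what lets the shared encoder benefit from all tasks at once, in line with the MTL motivation given above.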
-
This workflow aims to compare plant species across different natural spaces. The workflow involves downloading and filtering phytosociological inventories, preprocessing data, and unifying it for comparative analysis. The main outputs are a Venn diagram displaying shared and unique species, and a CSV table detailing common and uncommon species. The workflow addresses filter application effectiveness, Venn diagram clarity, species table accuracy, and overall efficiency in processing and visualization, supporting ecological studies of plant distribution.

Background

Comparative analysis of phytosociological inventories across different natural spaces is essential for understanding plant distribution. This workflow focuses on downloading inventories stored in the database, applying distinct filters for each natural space, and conducting a comparative analysis of shared and unique plant species. The primary output includes a Venn diagram representing species intersections and a CSV table detailing common and uncommon plant species across the selected natural spaces.

Introduction

In ecological studies, understanding the overlap and uniqueness of plant species across different natural spaces is crucial. This workflow employs phytosociological inventories stored in the database, downloading them separately for each natural space using specific filters. The workflow then conducts a comparative analysis, identifying shared and unique plant species. The visualization includes a Venn diagram for easy interpretation and a CSV table highlighting the common and uncommon species across the selected natural spaces.

Aims

The primary aim of this workflow is to facilitate the comparison of phytosociological inventories from different natural spaces, emphasizing shared and unique plant species. The workflow includes the following key components (a minimal comparison sketch follows this entry):

- Inventory Download and Preprocessing: Downloads phytosociological inventories from the database, applies specific filters for each natural space, and preprocesses the data to retain only the species present in each zone.
- Data Unification: Unifies the processed data into a single dataset, facilitating comparative analysis.
- Venn Diagram Representation: Generates a Venn diagram to visually represent the overlap and uniqueness of plant species across the selected natural spaces.
- Species Table Generation: Creates a CSV table showcasing common and uncommon plant species in the selected natural spaces.

Scientific Questions

- Filter Application Effectiveness: How effectively does the workflow apply distinct filters to download inventories for each natural space?
- Venn Diagram Interpretation: How intuitive and informative is the Venn diagram representation of shared and unique plant species across the selected natural spaces?
- Species Table Accuracy: How accurate is the CSV table in presenting common and uncommon plant species in the comparative analysis?
- Workflow Efficiency: How efficiently does the workflow streamline the entire process, from data download to visualization, for comparative phytosociological analysis?
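A minimal sketch of the comparison and output steps, assuming the per-space species lists have already been downloaded and filtered into Python sets; matplotlib_venn, the example species, and the file names are illustrative assumptions rather than the workflow's actual implementation.

```python
import csv

from matplotlib import pyplot as plt
from matplotlib_venn import venn2   # assumption: two natural spaces compared

# Species recorded in each natural space (already filtered per zone).
space_a = {"Quercus ilex", "Pistacia lentiscus", "Cistus albidus"}
space_b = {"Quercus ilex", "Juniperus phoenicea", "Cistus albidus"}

# Venn diagram of shared vs. unique species.
venn2([space_a, space_b], set_labels=("Space A", "Space B"))
plt.savefig("species_venn.png")

# CSV table of common and uncommon species across the two spaces.
with open("species_comparison.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["species", "status"])
    for sp in sorted(space_a & space_b):
        writer.writerow([sp, "common"])
    for sp in sorted(space_a ^ space_b):
        writer.writerow([sp, "uncommon"])
```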
-
This workflow aims to streamline the integration of phytosociological inventory data from Word documents (.docx) into a MongoDB database. This process is essential for the project's Virtual Research Environment (VRE), facilitating robust data analysis. Key components include converting Word documents to JSON format, checking for duplicate inventories to ensure data integrity, and uploading the JSON files to the database. This workflow ensures a reliable, comprehensive dataset for further exploration and utilization within the VRE, enhancing the project's inventory database.

Background

Efficient data management is crucial in phytosociological inventories, necessitating seamless integration of inventory data. This workflow addresses a key aspect by facilitating the importation of phytosociological inventories stored in Word documents (.docx) into the MongoDB database. This integration is vital for the project's Virtual Research Environment (VRE), providing a foundation for robust data analysis. The workflow comprises two essential components: converting Word to JSON and checking for inventory duplicates, ultimately enhancing the integrity and expansiveness of the inventory database.

Introduction

In phytosociological inventories, effective data handling is paramount, particularly concerning the integration of inventory data. This workflow focuses on the pivotal task of importing phytosociological inventories, stored in Word documents (.docx), into the MongoDB database. This process is integral to the VRE of the project, laying the groundwork for comprehensive data analysis. The workflow's primary goal is to ensure a smooth and duplicate-free integration, promoting a reliable dataset for further exploration and utilization within the project's VRE.

Aims

The primary aim of this workflow is to streamline the integration of phytosociological inventory data, stored in Word documents (.docx), into the MongoDB database. This ensures a robust and duplicate-free dataset for further analysis within the project's VRE. To achieve this, the workflow includes the following key components (a minimal parsing sketch follows this entry):

- Word to JSON Conversion: Converts phytosociological inventories stored in Word documents (.docx) to JSON, preparing the data for MongoDB compatibility.
- Duplicate Check and Database Upload: Checks for duplicate inventories in the MongoDB database and uploads the JSON files, incrementing the inventory count in the database.

Scientific Questions

- Word Document Parsing: How effectively does the workflow parse and extract phytosociological inventories from Word documents (.docx)?
- Data Format Compatibility: How successful is the conversion of Word-based phytosociological inventories to the JSON format for MongoDB integration?
- Database Integrity Check: How effective is the duplicate check component in ensuring data integrity by identifying and handling duplicate inventories?
- Inventory Count Increment: How does the workflow contribute to the increment of the inventory count in the MongoDB database, and how is this reflected in the overall project dataset?
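A minimal sketch of the Word-to-JSON step, assuming each inventory is laid out as a two-column (field, value) table inside the .docx file; python-docx, the table layout, and the file names are assumptions about a plausible document structure, not the project's actual format. The duplicate check and MongoDB upload would then mirror the ZIP/XML entry above.

```python
import json

from docx import Document   # python-docx

def docx_to_json(path: str) -> dict:
    """Read an inventory stored as two-column (field, value) tables in a .docx file."""
    doc = Document(path)
    record = {}
    for table in doc.tables:
        for row in table.rows:
            cells = [c.text.strip() for c in row.cells]
            if len(cells) >= 2 and cells[0]:
                record[cells[0]] = cells[1]
    return record

record = docx_to_json("inventory.docx")
with open("inventory.json", "w", encoding="utf-8") as fh:
    json.dump(record, fh, ensure_ascii=False, indent=2)
```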
-
This workflow integrates the MEDA Toolbox for Matlab and Octave, focusing on data simulation, Principal Component Analysis (PCA), and result visualization. Key steps include simulating multivariate data, applying PCA for data modeling, and creating interactive visualizations. The MEDA Toolbox combines traditional and advanced methods, such as ANOVA Simultaneous Component Analysis (ASCA). The aim is to integrate the MEDA Toolbox into LifeWatch, providing tools for enhanced data analysis and visualization in research.

Background

This workflow is a template for the integration of the Multivariate Exploratory Data Analysis Toolbox (MEDA Toolbox, https://github.com/codaslab/MEDA-Toolbox) in LifeWatch. The MEDA Toolbox for Matlab and Octave is a set of multivariate analysis tools for the exploration of data sets. There are several alternative tools in the market for that purpose, both commercial and free; the PLS_Toolbox from Eigenvector Inc. is a very nice example. The MEDA Toolbox is not intended to replace or compete with any of these toolkits. Rather, the MEDA Toolbox is a complementary tool that includes several contributions of the Computational Data Science Laboratory (CoDaS Lab) to the field of data analysis. Thus, traditional exploratory plots based on Principal Component Analysis (PCA) or Partial Least Squares (PLS), such as score, loading, and residual plots, are combined with new methods: MEDA, oMEDA, SVI plots, ADICOV, EKF & CKF cross-validation, CSP, GPCA, etc. A main tool in the MEDA Toolbox which has received a lot of attention lately is ANOVA Simultaneous Component Analysis (ASCA). The ASCA code in the MEDA Toolbox is one of the most advanced internationally.

Introduction

The workflow integrates three examples of functionality within the MEDA Toolbox. First, there is a data simulation step, in which a matrix of random data is simulated with a user-defined correlation level. The output is sent to a modeling step, in which Principal Component Analysis (PCA) is computed. The PCA model is then sent to a visualization module.

Aims

The main goal of this template is the integration of the MEDA Toolbox in LifeWatch, including data simulation, data modeling, and data visualization routines.

Scientific Questions

This workflow only exemplifies the integration of the MEDA Toolbox. No specific questions are addressed.
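The workflow itself runs these steps as MEDA Toolbox routines in Matlab/Octave. Purely as a language-neutral sketch of the same three-step pipeline (simulate data with a user-defined correlation level, fit a PCA model, plot the scores), here is an equivalent outline using NumPy and scikit-learn; the data sizes and correlation value are arbitrary assumptions.

```python
import numpy as np
from matplotlib import pyplot as plt
from sklearn.decomposition import PCA

# 1) Simulate multivariate data with a user-defined correlation level.
n_obs, n_vars, corr = 100, 10, 0.7
cov = np.full((n_vars, n_vars), corr)
np.fill_diagonal(cov, 1.0)
rng = np.random.default_rng(0)
data = rng.multivariate_normal(np.zeros(n_vars), cov, size=n_obs)

# 2) Model the data with PCA (two components).
pca = PCA(n_components=2)
scores = pca.fit_transform(data)

# 3) Visualize the score plot.
plt.scatter(scores[:, 0], scores[:, 1])
plt.xlabel(f"PC1 ({pca.explained_variance_ratio_[0]:.0%} variance)")
plt.ylabel(f"PC2 ({pca.explained_variance_ratio_[1]:.0%} variance)")
plt.savefig("pca_scores.png")
```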
-
Background

Biological invasions are acknowledged to be significant environmental and economic threats, yet the identification of key ecological traits determining invasiveness of species has remained elusive. One unappreciated source of variation concerns dietary flexibility of non-native species and their ability to shift trophic position within invaded food webs. Trophic plasticity may greatly influence invasion success as it facilitates colonisation, adaptation, and successful establishment of non-native species into new territories. In addition, having a flexible diet gives the introduced species a better chance to become invasive and, as a consequence, to have a strong impact on food webs, determining secondary disruptions such as trophic cascades and changes in energy fluxes. The deleterious effects can affect multiple trophic levels.

Introduction

Crustaceans are considered the most successful taxonomic group of aquatic invaders worldwide. Their ability to colonise and easily adapt to new ecosystems can be ascribed to a number of ecological features including their omnivorous feeding behaviour. This validation case study focuses on two invasive crustaceans widely distributed in marine and freshwater European waters: the Atlantic blue crab Callinectes sapidus and the Louisiana crayfish or red swamp crayfish Procambarus clarkii. Callinectes sapidus and Procambarus clarkii are opportunistic omnivores that feed on a variety of food sources from detritus to plants and invertebrates. For this reason, they represent a good model to investigate the variation of trophic niches in invaded food webs and their ecological impact on native communities. The ecological consequences of the invasion and establishment of these invasive crustaceans can vary from modification of carbon cycles in benthic food webs to regulation of prey/predator abundance through bottom-up and top-down interactions. Understanding how the trophic ecology of these invasive crustaceans shapes benthic food webs in invaded ecosystems is crucial for an accurate assessment of their impact. The analysis of stable isotopes can provide important clues on the trophic effects of invasive species within non-native ecosystems by evaluating changes in their trophic position and characteristics of their trophic niche.

Aims

This validation case uses a collection of stable isotopes (δ13C and δ15N) of C. sapidus and P. clarkii and their potential prey in invaded food webs to quantify changes in the trophic position of the invaders and to assess post-invasion shifts in their dietary habits. This case study additionally evaluates the main environmental drivers involved in trophic niche adaptations and whether such bioclimatic predictors influence broad-scale patterns of variation in the trophic position of the invader.
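Trophic position is commonly estimated from δ15N values with a baseline correction. The following is a sketch of that standard formulation; the trophic enrichment factor and baseline trophic level shown are widely used defaults, not the values adopted in this validation case.

```python
def trophic_position(d15n_consumer: float, d15n_baseline: float,
                     tef: float = 3.4, lambda_base: float = 2.0) -> float:
    """Estimate trophic position from delta-15N values.

    tef: trophic enrichment factor per trophic level (per mil; ~3.4 is a common default).
    lambda_base: trophic level of the baseline organism (2 for a primary consumer).
    """
    return lambda_base + (d15n_consumer - d15n_baseline) / tef

# Example: a consumer 6.8 per mil above a primary-consumer baseline sits near TP 4.
print(trophic_position(d15n_consumer=14.0, d15n_baseline=7.2))  # ~4.0
```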
-
This workflow aims to streamline the integration of phytosociological inventory data stored in Excel format into a MongoDB database. This process is essential for the project's Virtual Research Environment (VRE), facilitating comprehensive data analysis. Key components include converting Excel files to JSON format, checking for duplicate inventories to ensure data integrity, and uploading the JSON files to the database. This workflow promotes a reliable, robust dataset for further exploration and utilization within the VRE, enhancing the project's inventory database.

Background

Efficient data management in phytosociological inventories requires seamless integration of inventory data. This workflow facilitates the importation of phytosociological inventories in Excel format into the MongoDB database, connected to the project's Virtual Research Environment (VRE). The workflow comprises two components: converting Excel to JSON and checking for inventory duplicates, ultimately enhancing the inventory database.

Introduction

Phytosociological inventories demand efficient data handling, especially concerning the integration of inventory data. This workflow focuses on the pivotal task of importing phytosociological inventories, stored in Excel format, into the MongoDB database. This process is integral to the VRE of the project, laying the groundwork for comprehensive data analysis. The workflow's primary goal is to ensure a smooth and duplicate-free integration, promoting a reliable dataset for further exploration and utilization within the project's VRE.

Aims

The primary aim of this workflow is to streamline the integration of phytosociological inventory data into the MongoDB database, ensuring a robust and duplicate-free dataset for further analysis within the project's VRE. To achieve this, the workflow includes the following key components (a minimal conversion sketch follows this entry):

1. Excel to JSON Conversion: Converts phytosociological inventories stored in Excel format to JSON, preparing the data for MongoDB compatibility.
2. Duplicate Check and Database Upload: Checks for duplicate inventories in the MongoDB database and uploads the JSON file, incrementing the inventory count in the database.

Scientific Questions

- Data Format Compatibility: How effectively does the workflow convert Excel-based phytosociological inventories to the JSON format for MongoDB integration?
- Database Integrity Check: How successful is the duplicate check component in ensuring data integrity by identifying and handling duplicate inventories?
- Inventory Count Increment: How does the workflow contribute to the increment of the inventory count in the MongoDB database, and how is this reflected in the overall project dataset?
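A minimal sketch of the Excel-to-JSON conversion with an in-file duplicate filter, assuming one inventory per spreadsheet row and an `inventory_id` column acting as the duplicate key; pandas, the column name, and the file names are illustrative assumptions. The resulting JSON is what the upload step would send to MongoDB, where the check against already-stored inventories can be done as in the ZIP/XML entry above.

```python
import json

import pandas as pd

# Read the spreadsheet (requires openpyxl for .xlsx files); one row per inventory.
df = pd.read_excel("inventories.xlsx")
records = df.to_dict(orient="records")

# Drop rows that repeat an already-seen identifier before writing the JSON payload.
seen, unique_records = set(), []
for record in records:
    if record["inventory_id"] in seen:      # assumed identifier column
        continue
    seen.add(record["inventory_id"])
    unique_records.append(record)

with open("inventories.json", "w", encoding="utf-8") as fh:
    json.dump(unique_records, fh, ensure_ascii=False, indent=2, default=str)
```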
-
Accurately mapping vegetation is crucial for environmental monitoring. Traditional methods for identifying shrubs are labor-intensive and impractical for large areas. This workflow uses remote sensing and deep learning to detect Juniperus shrubs from high-resolution RGB satellite images, making shrub identification more efficient and accessible to non-experts in machine learning.

Background

In a dynamic climate, accurately mapping vegetation distribution is essential for environmental monitoring, biodiversity conservation, forestry, and urban planning. One important application of vegetation mapping is the identification of shrub individuals. By shrub identification we mean the detection of shrub locations and the segmentation of shrub morphology.

Introduction

Yet, shrub species monitoring is a challenging task. Ecologists have traditionally identified shrubs using classical field surveying methods; however, this is difficult because the shrubs are often distributed over large areas that are mostly inaccessible. Thus, these methods are considered labor-intensive, costly, time-consuming, unsustainable, limited to small spatial and temporal scales, and their data are often not publicly available. Combining remote sensing and deep learning, however, can play a significant role in tackling these challenges, providing a great opportunity to improve plant surveying. First, remote sensing can offer highly detailed spatial resolution, granting exceptional flexibility in data acquisition. These data can then be processed by deep learning models for automatic identification of shrubs.

Aims

The objective of this workflow is to help scientists who are not experts in machine learning detect Juniperus shrubs from very-high-resolution RGB satellite images using deep learning and remote sensing tools.

Scientific Questions

Can we accurately detect high-mountain Juniperus shrubs from very-high-resolution RGB satellite images using deep learning?
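The entry does not specify the network used. Purely as an illustration of the detection-plus-segmentation inference involved, the sketch below runs a torchvision Mask R-CNN on an RGB tile, with generic COCO-pretrained weights standing in for shrub-specific weights; the model choice, tile name, and score threshold are all assumptions.

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import maskrcnn_resnet50_fpn

# Generic pretrained weights stand in here for weights trained on Juniperus shrubs.
model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

tile = read_image("rgb_tile.png").float() / 255.0     # (3, H, W) satellite tile in [0, 1]
with torch.no_grad():
    pred = model([tile])[0]

# Keep confident detections: each comes with a bounding box, a score, and a soft mask.
keep = pred["scores"] > 0.5
boxes, masks = pred["boxes"][keep], pred["masks"][keep]
print(f"{len(boxes)} candidate shrubs; mask tensor shape: {tuple(masks.shape)}")
```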
-
Background

Ailanthus altissima is one of the worst invasive plants in Europe. It reproduces both by seeds and asexually through root sprouting. The winged seeds can be dispersed by wind, water and machinery, while its robust root system can generate numerous suckers and cloned plants. In this way, Ailanthus altissima typically occurs in very dense clumps, but can also occasionally grow as widely spaced or single stems. This highly invasive plant can colonise a wide range of anthropogenic and natural sites, from stony and sterile soils to rich alluvial bottoms. Due to its vigour, rapid growth, tolerance, adaptability and lack of natural enemies, it spreads spontaneously, out-competing other plants and inhibiting their growth.

Introduction

Over the last few decades, Ailanthus altissima has quickly spread in the Alta Murgia National Park (Southern Italy), which is mostly characterized by dry grassland and pseudo-steppe, wide-open spaces with low vegetation, which are very vulnerable to invasion. Ailanthus altissima causes serious direct and indirect damage to ecosystems, replacing and altering communities that have great conservation value, producing severe ecological, environmental and economic effects, and causing natural habitat loss and degradation. The spread of Ailanthus altissima is likely to increase in the future, unless robust action is taken at all levels to control its expansion. A recent working document of the European Commission found that the cost of controlling and eliminating invasive species in Europe amounts to €12 billion per year. Two relevant questions then arise: i) whether it is possible or not to fully eradicate or, at least, to reduce the impact of an invasive species, and ii) how to achieve this at a minimum cost, in terms of both environmental damage and economic resources. The LIFE Programme-funded Life Alta Murgia project (LIFE12BIO/IT/000213) had, as its main objective, the eradication of this invasive exotic tree species from the Alta Murgia National Park. That project provided both the expert knowledge and valuable in-field data for the Ailanthus validation case study, which was conceived and developed within the Internal Joint Initiative of LifeWatch ERIC.

Aims

At the start of the ongoing eradication program, a single map of A. altissima was available, dating back to 2012. Due to the lack of data, predicting the extent of invasion and its impacts was extremely difficult, making it impossible to assess the efficacy of control measures. Static models based on statistics cannot predict spatial–temporal dynamics (e.g. where and when A. altissima may repopulate an area), whereas mechanistic models incorporating the growth and spread of a plant would require precise parametrisation, which was extremely difficult with the scarce information available. To overcome these limitations, a relatively simple mechanistic model has been developed, a diffusion model, which is validated against the current spatial distribution of the plant estimated by satellite images. This model accounts for the effect of eradication programs by using a reaction term to estimate the uncertainty of the prediction. This model provides an automatic tool to estimate a priori the effectiveness of a planned control action under temporal and budget constraints. This robust tool can be easily applied to other geographical areas and, potentially, to different species.
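As an illustration of the kind of model described above (not the study's exact equation), a reaction-diffusion formulation typically combines spatial spread, local growth, and a removal term representing eradication effort:

```latex
\frac{\partial u}{\partial t} = D\,\nabla^{2} u + r\,u\left(1 - \frac{u}{K}\right) - E(x,t)\,u
```

where u(x,t) is the local plant density, D the diffusion coefficient governing spread, r the intrinsic growth rate, K the carrying capacity, and E(x,t) the removal rate imposed by control actions; the logistic growth and proportional-removal terms are generic choices shown only to make the structure concrete.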
-
This workflow focuses on enhancing spatial interpolation of meteorological variables by incorporating altitude. The process involves importing geolocated meteorological data from shapefiles, downloading a Digital Elevation Model (DEM) for elevation data, and utilizing 3D Kriging for interpolation. This method improves the accuracy of meteorological data interpolation across various elevations, providing comprehensive spatial coverage. Key components include precise data import, effective DEM integration, and accurate 3D Kriging, addressing scientific questions about data import precision, DEM integration, Kriging accuracy, and spatial coverage enhancement.

Background

Interpolating geolocated meteorological variables is crucial for obtaining comprehensive insights into environmental conditions. This workflow, comprising three components, focuses on importing shapefile data containing geolocated meteorological variables. The primary objective is to perform a 3D interpolation, considering altitude as a significant factor. To achieve this, the workflow downloads a Digital Elevation Model (DEM) to incorporate elevation information and utilizes 3D Kriging for interpolation.

Introduction

Interpolating meteorological variables in geospatial datasets is essential for understanding environmental conditions. This workflow aims to enhance the accuracy of such interpolations by importing shapefile data, obtaining elevation data from a DEM, and performing a 3D interpolation using Kriging. The resulting dataset provides interpolated meteorological values for locations not covered by the original sampling.

Aims

The primary aim of this workflow is to achieve accurate 3D interpolation of meteorological variables, considering altitude, to enhance spatial coverage. The workflow includes the following key components (a minimal kriging sketch follows this entry):

- Shapefile Data Import: Imports geolocated meteorological variables from a shapefile, preparing the data for 3D interpolation.
- Digital Elevation Model (DEM) Download: Downloads a Digital Elevation Model (DEM) to obtain elevation information for the interpolation process.
- 3D Kriging Interpolation: Utilizes 3D Kriging to interpolate meteorological variables, incorporating altitude information for enhanced accuracy.

Scientific Questions

- Data Import Precision: How precise is the workflow in importing geolocated meteorological variables from the shapefile data?
- DEM Download and Integration: How effectively does the workflow download the DEM and integrate elevation information into the interpolation process?
- 3D Kriging Accuracy: How accurate is the 3D Kriging interpolation in providing reliable meteorological values, considering altitude as a key factor?
- Enhancement of Spatial Coverage: To what extent does the 3D interpolation process enhance spatial coverage, providing interpolated values for locations not originally sampled?
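A minimal sketch of the shapefile import and 3D Kriging step, assuming the point shapefile already carries (or has been joined with DEM-derived) elevation values and a meteorological attribute; GeoPandas, PyKrige, the field names, and the grid sizes are illustrative assumptions rather than the workflow's actual implementation.

```python
import geopandas as gpd
import numpy as np
from pykrige.ok3d import OrdinaryKriging3D

# Station observations from the shapefile; in the full workflow, each point's
# elevation comes from the downloaded DEM (here assumed as a column).
stations = gpd.read_file("stations.shp")
x = stations.geometry.x.to_numpy()
y = stations.geometry.y.to_numpy()
z = stations["elevation"].to_numpy()
temp = stations["temperature"].to_numpy()   # assumed variable name

# Fit a 3D ordinary-kriging model and predict on a coarse x/y/z grid,
# so altitude is treated as a third interpolation dimension.
ok3d = OrdinaryKriging3D(x, y, z, temp, variogram_model="spherical")
gridx = np.linspace(x.min(), x.max(), 50)
gridy = np.linspace(y.min(), y.max(), 50)
gridz = np.linspace(z.min(), z.max(), 10)
estimates, variances = ok3d.execute("grid", gridx, gridy, gridz)
print(estimates.shape)   # 3-D array of interpolated values over the grid
```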