  • Background
Freshwater ecosystems have been profoundly affected by habitat loss, degradation, and overexploitation, leaving them especially vulnerable to biological invasions. Whether non-indigenous species (NIS) are the key drivers of biodiversity loss or merely complementary factors is still debated in the scientific community; nevertheless, biological invasions, together with other anthropogenic stressors, are driving population declines and the homogenisation of biodiversity in freshwater ecosystems worldwide. For example, it has been demonstrated that river basins with greater numbers of non-indigenous species have higher extinction rates of native fish species. Consequently, the application of effective biomonitoring approaches to support the protection actions of managers, stakeholders and policy-makers is now essential.
Introduction
Conventional methods of monitoring freshwater fish diversity are based on direct observation of organisms and are therefore costly, labour- and resource-intensive, require taxonomic expertise, and can be invasive. Obtaining information about species and communities by retrieving DNA from environmental samples can overcome some of these difficulties. The genetic material recovered from such samples is known as environmental DNA (eDNA). Environmental DNA can be isolated from water, soil, air or faeces, as organisms shed their genetic material into their surroundings through metabolic waste, damaged tissues, sloughed skin cells and decomposition. The analysis of eDNA consists of extracting the genetic material and subjecting it to a Polymerase Chain Reaction (PCR), which amplifies the target DNA. The use of high-throughput sequencing (HTS) then allows the simultaneous identification of many species within a certain taxonomic group. This community-wide approach is known as eDNA metabarcoding and involves the use of broad-range primers during PCR that amplify a set of species. In recent years the cost of this technology has decreased drastically, making it very attractive for conservation management and scientific research. A number of studies have demonstrated that eDNA metabarcoding is more sensitive than conventional biomonitoring methods for freshwater fish, as it can detect rare or low-abundance taxa. As a result, eDNA metabarcoding can be used as an early-warning tool to detect new NIS at the initial stages of colonisation, when they are not yet abundant in the ecosystem.
Aims
This validation case concerns eDNA metabarcoding of fish sequences collected from the Douro Basin in Portugal. DNA sequences are processed through a bioinformatic pipeline wrapped in the first part of the analytical workflow, which conducts a quality check and taxonomically assigns the DNA sequences to produce a list of taxa. The analytical workflow can process DNA sequences of different kinds, depending on the genetic markers used for the analysis, and so it can be applied to different taxonomic groups and ecosystems. The taxa identified might include indigenous organisms as well as taxa newly recorded within a certain geographical region. For that reason, the national checklists of the Global Register of Introduced and Invasive Species (GRIIS), available through GBIF, are consulted to check whether the organisms detected are recognised as NIS or whether previously unrecorded NIS have been detected through the eDNA metabarcoding analysis.
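As an illustration of this final screening step, the sketch below (Python, using the public GBIF API) matches each taxon from the pipeline's output list against the GBIF backbone and looks it up in a national GRIIS checklist; the checklist dataset key, the example taxa and the exact matching criterion are placeholders, not part of the validation case itself.

```python
"""Sketch: flag taxa from an eDNA-derived species list that appear in a
national GRIIS checklist published on GBIF. The GRIIS dataset key below is a
placeholder; look up the checklist for the country of interest on gbif.org."""
import requests

GBIF_API = "https://api.gbif.org/v1"
GRIIS_DATASET_KEY = "<GRIIS-checklist-dataset-key>"  # placeholder, country-specific

def match_backbone(name: str) -> dict:
    """Match a raw taxon name against the GBIF backbone taxonomy."""
    r = requests.get(f"{GBIF_API}/species/match", params={"name": name})
    r.raise_for_status()
    return r.json()

def in_griis_checklist(name: str) -> bool:
    """Return True if the name is found in the selected GRIIS checklist."""
    r = requests.get(
        f"{GBIF_API}/species/search",
        params={"datasetKey": GRIIS_DATASET_KEY, "q": name, "limit": 5},
    )
    r.raise_for_status()
    hits = r.json().get("results", [])
    return any(h.get("canonicalName", "").lower() == name.lower() for h in hits)

detected_taxa = ["Lepomis gibbosus", "Squalius alburnoides"]  # hypothetical pipeline output
for taxon in detected_taxa:
    backbone = match_backbone(taxon)
    status = "possible NIS" if in_griis_checklist(taxon) else "not in checklist"
    print(taxon, backbone.get("scientificName", "no backbone match"), status)
```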

  • The Ecological Niche Modelling Workflows offer an extensible framework for analyzing or predicting the impact of environmental changes on the distribution of biodiversity. Especially in combination with data-aggregation workflows such as the Taxonomic Data Refinement Workflow, the Ecological Niche Modelling workflows facilitate the analysis of species distribution patterns over large geo-temporal, taxonomic, and environmental scales. Examples of applications are studies of species adaptations to climate change, dynamic modeling of ecologically related species, and identification of regions with accumulated risk of invasion, potential for restoration, or suitability as natural protected areas.
Developed by: The Biodiversity Virtual e-Laboratory (BioVeL) (EU FP7 project)
Technology or platform: These workflows have been developed to run in the Taverna automated workflow environment (https://incubator.apache.org/projects/taverna.html). In their current form, the workflow files (with the .t2flow extension) can be loaded and executed in the workbench variant of Taverna. They have been tested with Taverna Workbench version 2.4. These workflows can also be run in the BioVeL Portal, a lightweight user interface which allows browsing, reviewing and running Taverna workflows without the need to install any software.

  • Background
Monitoring hard-bottom marine biodiversity can be challenging, as it often involves non-standardised sampling methods that limit scalability and inter-comparison across different monitoring approaches. It is therefore essential to implement standardised techniques when assessing the status of and changes in marine communities, in order to provide the correct information to support management policy and decisions, and to ensure the most appropriate level of protection for the biodiversity in each ecosystem. Biomonitoring methods need to comply with a number of criteria, including the implementation of broadly accepted standards and protocols and the collection of FAIR data (Findable, Accessible, Interoperable, and Reusable).
Introduction
Artificial substrates represent a promising tool for monitoring community assemblages of hard-bottom habitats with a standardised methodology. The European ARMS project is a long-term observatory network in which about 20 institutions distributed across 14 European countries collaborate, with sites extending to Greenland and Antarctica. The network consists of Autonomous Reef Monitoring Structures (ARMS) which are deployed in the proximity of marine stations and Long-Term Ecological Research sites. ARMS units are passive monitoring systems made of stacked settlement plates that are placed on the sea floor. The three-dimensional structure of the settlement units mimics the complexity of marine substrates and attracts sessile and motile benthic organisms. After a certain period of time these structures are retrieved, and visual, photographic, and genetic (DNA metabarcoding) assessments are made of the lifeforms that have colonised them. These data are used to systematically assess the status of, and changes in, the hard-bottom communities of near-coast ecosystems.
Aims
ARMS data are quality controlled and open access, and they are permanently stored in the Marine Data Archive along with their metadata (IMIS, the catalogue of VLIZ), ensuring that the data are FAIR. Data from ARMS observatories provide a promising early-warning system for marine biological invasions by: i) identifying newly arrived Non-Indigenous Species (NIS) at each ARMS site; ii) tracking the migration of already known NIS in European continental waters; iii) monitoring the composition of hard-bottom communities over longer periods; and iv) identifying the Essential Biodiversity Variables (EBVs) for hard-bottom fauna, including NIS. The ARMS validation case was conceived to achieve these objectives: a data-analysis workflow was developed to process raw genetic data from ARMS; end-users can select ARMS samples from the ever-growing number available in the collection; and raw DNA sequences are analysed using a bioinformatic pipeline (P.E.M.A.) embedded in the workflow for taxonomic identification. In the data-analysis workflow, the correct identification of taxa at each specific location is made with reference to WoRMS and WRiMS, web services used to check, respectively, the identity of the organisms and whether they are recorded as introduced.
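To illustrate the kind of check performed against WoRMS and WRiMS, the sketch below (Python, calling the public WoRMS REST service) resolves a taxon name to an AphiaID and scans its distribution records for an introduced ("Alien") flag; the example taxon and the exact response fields are assumptions to verify against the service documentation, not the workflow's own code.

```python
"""Sketch: resolve taxon names against WoRMS and inspect distribution records
for an 'Alien' establishmentMeans flag (the information that WRiMS curates).
Endpoint paths follow the public WoRMS REST documentation; treat the exact
response fields as assumptions to verify."""
import requests

WORMS_REST = "https://www.marinespecies.org/rest"

def worms_record(name: str):
    """Return the first matching WoRMS record for a scientific name, if any."""
    r = requests.get(f"{WORMS_REST}/AphiaRecordsByName/{name}",
                     params={"like": "false", "marine_only": "true"})
    if r.status_code != 200:          # WoRMS returns 204 when nothing matches
        return None
    records = r.json()
    return records[0] if records else None

def flagged_alien_localities(aphia_id: int) -> list:
    """Localities whose distribution record marks the taxon as introduced."""
    r = requests.get(f"{WORMS_REST}/AphiaDistributionsByAphiaID/{aphia_id}")
    if r.status_code != 200:
        return []
    return [d.get("locality", "?") for d in r.json()
            if str(d.get("establishmentMeans", "")).lower() == "alien"]

for name in ["Magallana gigas"]:      # hypothetical taxon from a plate sample
    rec = worms_record(name)
    if rec:
        print(name, rec["AphiaID"], flagged_alien_localities(rec["AphiaID"]))
```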

  • The aim of the (Taxonomic) Data Refinement Workflow is to provide a streamlined workflow environment for preparing observational and specimen data sets for use in scientific analysis on the Taverna platform. The workflow has been designed so that it: accepts input data in a recognized format but originating from various sources (e.g. services, local user data sets); includes a number of graphical user interfaces to view and interact with the data; ensures that the output of each part of the workflow is compatible with the input of every other part, so that the user is free to choose a specific sequence of actions; and allows for the use of custom-built as well as third-party applications and tools. This workflow can be accessed through the BioVeL Portal at http://biovelportal.vliz.be/workflows?category_id=1 and can be combined with the Ecological Niche Modelling Workflows (http://marine.lifewatch.eu/ecological-niche-modelling).
Developed by: Biodiversity Virtual e-Laboratory (BioVeL) (EU FP7 project)
Technology or platform: The workflow has been developed to run in the Taverna automated workflow environment.

  • Background
Ailanthus altissima is one of the worst invasive plants in Europe. It reproduces both by seed and asexually through root sprouting. The winged seeds can be dispersed by wind, water and machinery, while its robust root system can generate numerous suckers and cloned plants. In this way, Ailanthus altissima typically occurs in very dense clumps, but it can also occasionally grow as widely spaced or single stems. This highly invasive plant can colonise a wide range of anthropogenic and natural sites, from stony and sterile soils to rich alluvial bottoms. Owing to its vigour, rapid growth, tolerance, adaptability and lack of natural enemies, it spreads spontaneously, out-competing other plants and inhibiting their growth.
Introduction
Over the last few decades, Ailanthus altissima has spread quickly in the Alta Murgia National Park (Southern Italy), which is mostly characterised by dry grassland and pseudo-steppe, wide-open spaces with low vegetation that are very vulnerable to invasion. Ailanthus altissima causes serious direct and indirect damage to ecosystems, replacing and altering communities of great conservation value, producing severe ecological, environmental and economic effects, and causing natural habitat loss and degradation. The spread of Ailanthus altissima is likely to increase in the future unless robust action is taken at all levels to control its expansion. A recent working document of the European Commission estimated the cost of controlling and eliminating invasive species in Europe at €12 billion per year. Two relevant questions then arise: i) whether it is possible to fully eradicate, or at least reduce, the impact of an invasive species; and ii) how to achieve this at minimum cost, in terms of both environmental damage and economic resources. The LIFE-programme-funded Life Alta Murgia project (LIFE12BIO/IT/000213) had, as its main objective, the eradication of this invasive exotic tree species from the Alta Murgia National Park. That project provided both the expert knowledge and valuable in-field data for the Ailanthus validation case study, which was conceived and developed within the Internal Joint Initiative of LifeWatch ERIC.
Aims
At the start of the ongoing eradication programme, a single map of A. altissima was available, dating back to 2012. Owing to this lack of data, predicting the extent of the invasion and its impacts was extremely difficult, making it impossible to assess the efficacy of control measures. Static models based on statistics cannot predict spatial-temporal dynamics (e.g. where and when A. altissima may repopulate an area), whereas mechanistic models incorporating the growth and spread of a plant would require precise parametrisation, which was extremely difficult with the scarce information available. To overcome these limitations, a relatively simple mechanistic model has been developed: a diffusion model, validated against the current spatial distribution of the plant as estimated from satellite images. The model accounts for the effect of eradication programmes through a reaction term and provides an estimate of the uncertainty of the prediction. It offers an automatic tool to estimate a priori the effectiveness of a planned control action under temporal and budget constraints, and this robust tool can be easily applied to other geographical areas and, potentially, to different species.
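To make the modelling idea concrete, the toy sketch below (Python) advances a reaction-diffusion equation for plant density with logistic growth and an eradication term. It is only an illustration of the model class described, not the validated Ailanthus model; all parameter values and the eradication field are placeholders.

```python
"""Sketch: explicit finite-difference step for a reaction-diffusion model of
plant density u(x, y, t): du/dt = D * laplacian(u) + r*u*(1 - u/K) - h(x, y, t).
This is an illustrative toy, not the project's validated model; D, r, K and the
eradication field h are placeholders."""
import numpy as np

D, r, K = 0.05, 0.4, 1.0          # diffusion coefficient, growth rate, carrying capacity
dx, dt = 1.0, 0.1                  # grid spacing and time step (chosen for stability)
u = np.zeros((100, 100))
u[50, 50] = 0.5                    # initial infestation focus
h = np.zeros_like(u)               # eradication effort per cell (e.g. a planned action)

def step(u, h):
    # Five-point Laplacian with periodic boundaries (np.roll), for simplicity.
    lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
           np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u) / dx**2
    growth = r * u * (1 - u / K)           # logistic reaction term
    return np.clip(u + dt * (D * lap + growth - h), 0, K)

for _ in range(500):               # simulate 500 time steps
    u = step(u, h)
print("invaded cells:", int((u > 0.01).sum()))
```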

  • This workflow aims to compare plant species across different natural spaces. The workflow involves downloading and filtering phytosociological inventories, preprocessing data, and unifying it for comparative analysis. The main outputs are a Venn diagram displaying shared and unique species, and a CSV table detailing common and uncommon species. The workflow addresses filter application effectiveness, Venn diagram clarity, species table accuracy, and overall efficiency in processing and visualization, supporting ecological studies of plant distribution.
Background
Comparative analysis of phytosociological inventories across different natural spaces is essential for understanding plant distribution. This workflow focuses on downloading inventories stored in the database, applying distinct filters for each natural space, and conducting a comparative analysis of shared and unique plant species. The primary output includes a Venn diagram representing species intersections and a CSV table detailing common and uncommon plant species across the selected natural spaces.
Introduction
In ecological studies, understanding the overlap and uniqueness of plant species across different natural spaces is crucial. This workflow employs phytosociological inventories stored in the database, downloading them separately for each natural space using specific filters. The workflow then conducts a comparative analysis, identifying shared and unique plant species. The visualization includes a Venn diagram for easy interpretation and a CSV table highlighting the common and uncommon species across the selected natural spaces.
Aims
The primary aim of this workflow is to facilitate the comparison of phytosociological inventories from different natural spaces, emphasizing shared and unique plant species. The workflow includes the following key components:
- Inventory Download and Preprocessing: Downloads phytosociological inventories from the database, applies specific filters for each natural space, and preprocesses the data to retain only the species present in each zone.
- Data Unification: Unifies the processed data into a single dataset, facilitating comparative analysis.
- Venn Diagram Representation: Generates a Venn diagram to visually represent the overlap and uniqueness of plant species across the selected natural spaces.
- Species Table Generation: Creates a CSV table showcasing common and uncommon plant species in the selected natural spaces.
Scientific Questions
- Filter Application Effectiveness: How effectively does the workflow apply distinct filters to download inventories for each natural space?
- Venn Diagram Interpretation: How intuitive and informative is the Venn diagram representation of shared and unique plant species across the selected natural spaces?
- Species Table Accuracy: How accurate is the CSV table in presenting common and uncommon plant species in the comparative analysis?
- Workflow Efficiency: How efficiently does the workflow streamline the entire process, from data download to visualization, for comparative phytosociological analysis?
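A minimal sketch of the comparison step is given below (Python, assuming two CSV exports with a "species" column and the matplotlib_venn package); file and column names are hypothetical, not the workflow's actual interface.

```python
"""Sketch: compare species lists from two natural spaces and produce a Venn
diagram plus a CSV of shared/unique species. Column names and the input files
are assumptions about the inventory export format."""
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib_venn import venn2

# Hypothetical inventory exports, one per natural space, each with a 'species' column.
inv_a = pd.read_csv("inventories_space_A.csv")
inv_b = pd.read_csv("inventories_space_B.csv")
species_a, species_b = set(inv_a["species"]), set(inv_b["species"])

# Venn diagram of shared and unique species between the two spaces.
venn2([species_a, species_b], set_labels=("Space A", "Space B"))
plt.savefig("venn_species.png")

# CSV table of common and uncommon species.
rows = ([{"species": s, "status": "shared"} for s in sorted(species_a & species_b)] +
        [{"species": s, "status": "only Space A"} for s in sorted(species_a - species_b)] +
        [{"species": s, "status": "only Space B"} for s in sorted(species_b - species_a)])
pd.DataFrame(rows).to_csv("species_comparison.csv", index=False)
```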

  • This workflow streamlines the export, preprocessing, and analysis of phytosociological inventories from a project database. The workflow's goals include exporting and preprocessing inventories, conducting statistical analyses, and using interactive graphs to visualize species dominance, altitudinal distribution, average coverage, similarity clusters, and species interactions. It also calculates and visualizes the fidelity index for species co-occurrence. This workflow addresses key scientific questions about dominant species, distribution patterns, species coverage, inventory similarity, species interactions, and co-occurrence probabilities, aiding efficient vegetation management in environmental projects.
Background
Efficient vegetation management in environmental projects necessitates a detailed analysis of phytosociological inventories. This workflow streamlines the export and preprocessing of vegetation inventories from the project database. Subsequently, it conducts various statistical analyses and graphical representations, offering a comprehensive view of plant composition and interactions.
Introduction
In the realm of vegetation research, the availability of phytosociological data is paramount. This workflow empowers users to specify parameters for exporting vegetation inventories, performs preprocessing, and conducts diverse statistical analyses. The resulting insights are visually represented through interactive graphs, highlighting predominant species, altitudinal ranges of plant communities, average species coverage, similarity clusters, and interactive species interactions.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of phytosociological inventories:
1. Export and Preprocess Inventories: Enable the export and preprocessing of phytosociological inventories stored in the project database.
2. Statistical Analyses of Species and Plant Communities: Conduct detailed statistical analyses on the species and plant communities present in the inventories.
3. Interactive Graphical Representation: Utilize interactive graphs to represent predominant species, altitudinal ranges of plant communities, and average species coverage.
4. Similarity Dendrogram: Generate a dendrogram grouping similar phytosociological inventories based on the similarity of their species content.
5. Interactive Species Interaction Analysis: Visualize species interactions through interactive graphs, facilitating the identification of species that tend to coexist.
6. Calculation and Visualization of Fidelity Index: Calculate the fidelity index between species and visually represent the probability of two or more species co-occurring in the same inventory.
Scientific Questions
This workflow addresses critical scientific questions related to the analysis of phytosociological inventories:
- Dominant Species Identification: Which species emerge as predominant in the phytosociological inventories, and what is their frequency of occurrence?
- Altitudinal Distribution Patterns: How are plant communities distributed across altitudinal ranges, and are there discernible patterns?
- Average Species Coverage Assessment: What is the average coverage of plant species, and how does it vary across different inventories?
- Similarity in Inventory Content: How are phytosociological inventories grouped based on the similarity of their species content?
- Species Interaction Dynamics: Which species exhibit notable interactive dynamics, and how can these interactions be visualized?
- Fidelity Between Species: What is the likelihood that two or more species co-occur in the same inventory, and how does this fidelity vary across species pairs?
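As an illustration of the dendrogram and fidelity steps, the sketch below (Python) clusters inventories by the Jaccard dissimilarity of their species content and computes a simple co-occurrence ratio between species pairs; the input format, the distance metric and the co-occurrence measure are illustrative assumptions, not necessarily the indices used in the workflow.

```python
"""Sketch: similarity dendrogram and a simple co-occurrence (fidelity-like)
matrix from a presence/absence inventory table. Jaccard clustering and the
co-occurrence ratio are illustrative choices, not the workflow's own code."""
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, dendrogram

# Hypothetical table: rows = inventories, columns = species, values = 0/1 presence.
pa = pd.read_csv("inventories_presence_absence.csv", index_col=0)

# Dendrogram of inventories based on Jaccard dissimilarity of their species content.
dist = pdist(pa.values.astype(bool), metric="jaccard")
dendrogram(linkage(dist, method="average"), labels=pa.index.tolist())
plt.tight_layout()
plt.savefig("inventory_dendrogram.png")

# Co-occurrence ratio per species pair: inventories containing both species
# divided by inventories containing either of them.
both = pa.T.astype(int) @ pa.astype(int)                              # joint occurrence counts
either = (pa.sum(0).values[:, None] + pa.sum(0).values[None, :]) - both
fidelity = (both / either).fillna(0)
fidelity.to_csv("species_cooccurrence.csv")
```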

  • The workflow "Pollen Trends Analysis with AeRobiology" leverages the AeRobiology library to manage and analyze time-series data of airborne pollen particles. Aimed at understanding the temporal dynamics of different pollen types, this workflow ensures data quality, profiles seasonal trends, and explores temporal variations. It integrates advanced features for analyzing pollen concentrations and their correlation with meteorological variables, offering comprehensive insights into pollen behavior over time. The workflow enhances data accessibility, facilitating broader research and public health applications.
Background
In the dynamic landscape of environmental research and public health, the AeRobiology library (https://cran.r-project.org/web/packages/AeRobiology/index.html) emerges as a potent instrument tailored for managing diverse airborne particle data. As the prevalence of airborne pollen-related challenges intensifies, understanding the nuanced temporal trends in different pollen types becomes imperative. AeRobiology not only addresses data quality concerns but also offers specialized tools for unraveling intricate insights into the temporal dynamics of various pollen types.
Introduction
Amidst the complexities of environmental research, particularly in the context of health studies, the meticulous analysis of airborne particles, specifically various pollen types, takes center stage. This workflow, harnessing the capabilities of AeRobiology, adopts a holistic approach to process and analyze time-series data. Focused on deciphering the temporal nuances of pollen seasons, this workflow aims to significantly contribute to our understanding of the temporal dynamics of different airborne particle types.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of time-series pollen samples:
- Holistic Data Quality Assurance: Conduct a detailed examination of time-series data for various pollen types, ensuring completeness and accuracy to establish a robust foundation for subsequent analysis.
- Pollen-Specific Seasonal Profiling: Leverage AeRobiology's advanced features to calculate and visually represent key parameters of the seasonal trends for different pollen types, offering a comprehensive profile of their temporal dynamics.
- Temporal Dynamics Exploration: Investigate the temporal trends in concentrations of various pollen types, providing valuable insights into their evolving nature over time.
- Enhanced Accessibility: Employ AeRobiology's interactive tools to democratize the exploration of time-series data, making complex information accessible to a broader audience of researchers and professionals.
Scientific Questions
This workflow addresses critical scientific questions related to pollen analysis:
- Distinct Temporal Signatures: What are the discernible patterns and trends in the temporal dynamics of different airborne pollen types, especially during peak seasons?
- Pollen-Specific Abundance Variability: How does the abundance of various pollen types vary throughout their respective seasons, and what environmental factors contribute to these fluctuations?
- Meteorological Correlations: Are there statistically significant correlations between the concentrations of different pollen types and specific meteorological variables, elucidating the influencing factors unique to each type?
- Cross-Annual Comparative Analysis: Through the lens of AeRobiology, how do the temporal trends of different pollen types compare across different years, and what contextual factors might explain observed variations?
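Because AeRobiology itself is an R package, the sketch below is only a language-agnostic illustration (Python/pandas) of the kinds of step the workflow performs: a completeness check, a per-year seasonal summary, and a correlation between pollen concentrations and meteorological variables. The input layout and column names are assumptions, not the AeRobiology API.

```python
"""Sketch (Python analogue; the workflow itself uses the R package AeRobiology):
basic quality check, seasonal summary and pollen/meteorology correlation for a
daily pollen time series. Column names are assumptions about the input file."""
import pandas as pd

df = pd.read_csv("daily_pollen.csv", parse_dates=["date"]).set_index("date")
pollen_cols = ["Olea", "Poaceae"]          # hypothetical pollen types
meteo_cols = ["temp_mean", "rain_mm"]      # hypothetical meteorological variables

# 1. Data quality: report missing daily values per pollen type.
full_index = pd.date_range(df.index.min(), df.index.max(), freq="D")
missing = df.reindex(full_index)[pollen_cols].isna().sum()
print("missing daily values:\n", missing)

# 2. Seasonal profile: annual peak concentration and peak date per pollen type.
for col in pollen_cols:
    peaks = df[col].groupby(df.index.year).agg(["max", "idxmax"])
    print(f"\n{col} seasonal peaks:\n", peaks)

# 3. Correlation between pollen concentrations and meteorological variables.
print("\nSpearman correlations:\n",
      df[pollen_cols + meteo_cols].corr(method="spearman").loc[pollen_cols, meteo_cols])
```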

  • This workflow focuses on analyzing diverse soil datasets using PCA to understand their physicochemical properties. It connects to a MongoDB database to retrieve soil samples based on user-defined filters. Key objectives include variable selection, data quality improvement, standardization, and conducting PCA for data variance and pattern analysis. The workflow generates graphical representations, such as covariance and correlation matrices, scree plots, and scatter plots, to enhance data interpretability. This facilitates the identification of significant variables, data structure exploration, and optimal component determination for effective soil analysis.
Background
Understanding the intricate relationships and patterns within soil samples is crucial for various environmental and agricultural applications. Principal Component Analysis (PCA) serves as a powerful tool in unraveling the complexity of multivariate soil datasets. Soil datasets often consist of numerous variables representing diverse physicochemical properties, making PCA an invaluable method for:
∙ Dimensionality Reduction: Simplifying the analysis without compromising data integrity by reducing the dimensionality of large soil datasets.
∙ Identification of Dominant Patterns: Revealing dominant patterns or trends within the data, providing insights into key factors contributing to overall variability.
∙ Exploration of Variable Interactions: Enabling the exploration of complex interactions between different soil attributes, enhancing understanding of their relationships.
∙ Interpretability of Data Variance: Clarifying how much variance is explained by each principal component, aiding in discerning the significance of different components and variables.
∙ Visualization of Data Structure: Facilitating intuitive comprehension of data structure through plots such as scatter plots of principal components, helping identify clusters, trends, and outliers.
∙ Decision Support for Subsequent Analyses: Providing a foundation for subsequent analyses by guiding decision-making, whether in identifying influential variables, understanding data patterns, or selecting components for further modeling.
Introduction
The motivation behind this workflow is rooted in the need to conduct a thorough analysis of a diverse soil dataset characterized by an array of physicochemical variables. Comprising multiple rows, each representing a distinct soil sample, the dataset encompasses variables such as the percentage of coarse sands, the percentage of organic matter, hydrophobicity, and others. The intricacies of this dataset demand a strategic approach to preprocessing, analysis, and visualization. This workflow introduces a novel approach by connecting to MongoDB, an agile and scalable NoSQL database, to retrieve soil samples based on user-defined filters. These filters can range from the natural site where the samples were collected to the specific date of collection. Furthermore, the workflow is designed to empower users in the selection of relevant variables, a task facilitated by user-defined parameters. This flexibility allows for a focused and tailored dataset, essential for meaningful analysis. Acknowledging the inherent challenges of missing data, the workflow offers options for data quality improvement, including optional interpolation of missing values or the removal of rows containing such values. Standardizing the dataset and specifying the target variable are crucial steps, establishing a robust foundation for subsequent statistical analyses. Incorporating PCA offers a sophisticated approach, enabling users to explore inherent patterns and structures within the data. The adaptability of PCA allows users to customize the analysis by specifying the number of components or the desired variance. The workflow concludes with practical graphical representations, including covariance and correlation matrices, a scree plot, and a scatter plot, offering users valuable visual insights into the complexities of the soil dataset.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of diverse soil samples:
∙ Connect to MongoDB and retrieve data: Dynamically connect to a MongoDB database, allowing users to download soil samples based on user-defined filters.
∙ Variable selection: Empower users to extract relevant variables based on user-defined parameters, facilitating a focused and tailored dataset.
∙ Data quality improvement: Provide options for interpolation or removal of missing values to ensure dataset integrity for downstream analyses.
∙ Standardization and target specification: Standardize the dataset values and designate the target variable, laying the groundwork for subsequent statistical analyses.
∙ PCA: Conduct PCA with flexibility, allowing users to specify the number of components or desired variance for a comprehensive understanding of data variance and patterns.
∙ Graphical representations: Generate visual outputs, including covariance and correlation matrices, a scree plot, and a scatter plot, enhancing the interpretability of the soil dataset.
Scientific questions
This workflow addresses critical scientific questions related to soil analysis:
∙ Facilitate data access: Streamline the retrieval of systematically stored soil sample data from the MongoDB database, aiding researchers in accessing previously stored, organized data.
∙ Variable importance: Identify variables contributing significantly to principal components through the covariance matrix and PCA.
∙ Data structure: Explore correlations between variables and gain insights from the correlation matrix.
∙ Optimal component number: Determine the optimal number of principal components using the scree plot for effective representation of data variance.
∙ Target-related patterns: Analyze how selected principal components correlate with the target variable in the scatter plot, revealing patterns based on target variable values.
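A minimal sketch of the retrieval-to-PCA chain is shown below (Python with pymongo and scikit-learn); the connection URI, database, collection, field names and filter are placeholders for the user-defined parameters described above, not the workflow's actual configuration.

```python
"""Sketch: retrieve soil samples from MongoDB with a user-defined filter, clean
and standardise selected variables, and run PCA with a scree plot. Database,
collection, field names and the filter are placeholders."""
import pandas as pd
import matplotlib.pyplot as plt
from pymongo import MongoClient
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

client = MongoClient("mongodb://localhost:27017")                   # placeholder URI
coll = client["soils_db"]["samples"]                                # placeholder names
query = {"site": "ExampleSite", "date": {"$gte": "2020-01-01"}}     # user-defined filter
df = pd.DataFrame(list(coll.find(query)))

variables = ["coarse_sand_pct", "organic_matter_pct", "hydrophobicity"]  # user selection
X = df[variables].apply(pd.to_numeric, errors="coerce")
X = X.interpolate().dropna()                  # optional interpolation, then drop leftovers

X_std = StandardScaler().fit_transform(X)     # standardise before PCA
pca = PCA(n_components=0.95)                  # keep components explaining 95% of variance
scores = pca.fit_transform(X_std)

# Scree plot of explained variance per retained component.
plt.plot(range(1, len(pca.explained_variance_ratio_) + 1),
         pca.explained_variance_ratio_, marker="o")
plt.xlabel("Principal component"); plt.ylabel("Explained variance ratio")
plt.savefig("scree_plot.png")
print(pd.DataFrame(pca.components_, columns=variables))             # component loadings
```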

  • This workflow aims to analyze diverse soil datasets using PCA to understand their physicochemical properties. The process starts with converting SPSS (.sav) files into CSV format for better compatibility. It emphasizes variable selection, data quality improvement, standardization, and conducting PCA for data variance and pattern analysis. The workflow includes generating graphical representations such as covariance and correlation matrices, scree plots, and scatter plots. These tools aid in identifying significant variables, exploring data structure, and determining optimal components for effective soil analysis.
Background
Understanding the intricate relationships and patterns within soil samples is crucial for various environmental and agricultural applications. Principal Component Analysis (PCA) serves as a powerful tool in unraveling the complexity of multivariate soil datasets. Soil datasets often consist of numerous variables representing diverse physicochemical properties, making PCA an invaluable method for:
∙ Dimensionality Reduction: Simplifying the analysis without compromising data integrity by reducing the dimensionality of large soil datasets.
∙ Identification of Dominant Patterns: Revealing dominant patterns or trends within the data, providing insights into key factors contributing to overall variability.
∙ Exploration of Variable Interactions: Enabling the exploration of complex interactions between different soil attributes, enhancing understanding of their relationships.
∙ Interpretability of Data Variance: Clarifying how much variance is explained by each principal component, aiding in discerning the significance of different components and variables.
∙ Visualization of Data Structure: Facilitating intuitive comprehension of data structure through plots such as scatter plots of principal components, helping identify clusters, trends, and outliers.
∙ Decision Support for Subsequent Analyses: Providing a foundation for subsequent analyses by guiding decision-making, whether in identifying influential variables, understanding data patterns, or selecting components for further modeling.
Introduction
The motivation behind this workflow is rooted in the need to conduct a thorough analysis of a diverse soil dataset characterized by an array of physicochemical variables. Comprising multiple rows, each representing a distinct soil sample, the dataset encompasses variables such as the percentage of coarse sands, the percentage of organic matter, hydrophobicity, and others. The intricacies of this dataset demand a strategic approach to preprocessing, analysis, and visualization. This workflow centers on the exploration of soil sample variability through PCA, using data stored in SPSS (.sav) files. These files, specific to the Statistical Package for the Social Sciences (SPSS), are commonly used for data analysis. To lay the groundwork, the workflow begins by transforming the initial SPSS file into CSV format, ensuring improved compatibility and ease of use throughout subsequent analyses. Incorporating PCA enables users to explore inherent patterns and structures within the data, and its adaptability allows users to customize the analysis by specifying the number of components or the desired variance. The workflow concludes with practical graphical representations, including covariance and correlation matrices, a scree plot, and a scatter plot, offering users valuable visual insights into the complexities of the soil dataset.
Aims
The primary objectives of this workflow are tailored to address specific challenges and goals inherent in the analysis of diverse soil samples:
∙ Data transformation: Efficiently convert the initial SPSS file into CSV format to enhance compatibility and ease of use.
∙ Standardization and target specification: Standardize the dataset and designate the target variable, ensuring consistency and preparing the data for subsequent PCA.
∙ PCA: Conduct PCA to explore patterns and variability within the soil dataset, facilitating a deeper understanding of the relationships between variables.
∙ Graphical representations: Generate graphical outputs, such as covariance and correlation matrices, aiding users in visually interpreting the complexities of the soil dataset.
Scientific questions
This workflow addresses critical scientific questions related to soil analysis:
∙ Variable importance: Identify variables contributing significantly to principal components through the covariance matrix and PCA.
∙ Data structure: Explore correlations between variables and gain insights from the correlation matrix.
∙ Optimal component number: Determine the optimal number of principal components using the scree plot for effective representation of data variance.
∙ Target-related patterns: Analyze how selected principal components correlate with the target variable in the scatter plot, revealing patterns based on target variable values.
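The sketch below (Python) illustrates the .sav-to-CSV conversion with pandas/pyreadstat followed by a standardised PCA and a scatter plot of the first two components; file names, the target variable and the component count are placeholders, not the workflow's actual parameters.

```python
"""Sketch: convert an SPSS .sav file to CSV (via pandas/pyreadstat) and run a
standardised PCA. File names, the target column and the variable selection are
placeholders for the workflow's user-defined parameters."""
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

df = pd.read_spss("soil_samples.sav")        # requires the pyreadstat package
df.to_csv("soil_samples.csv", index=False)   # keep a CSV copy for compatibility

target = "land_use"                                    # hypothetical target variable
features = df.drop(columns=[target]).select_dtypes("number").dropna()

X_std = StandardScaler().fit_transform(features)       # standardise before PCA
pca = PCA(n_components=2)
scores = pca.fit_transform(X_std)

# Scatter plot of the first two components, coloured by the target variable.
labels = df.loc[features.index, target]
for value in labels.unique():
    mask = (labels == value).values
    plt.scatter(scores[mask, 0], scores[mask, 1], label=str(value), s=15)
plt.xlabel("PC1"); plt.ylabel("PC2"); plt.legend()
plt.savefig("pca_scatter.png")
```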