  • Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for use in conservation, ecology and paleontology. The package includes automated tests to flag (and exclude) records assigned to country or province centroids, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas or the locations of biodiversity institutions (museums, zoos, botanical gardens, universities). It also identifies per-species outlier coordinates, zero coordinates, identical latitude/longitude values and invalid coordinates, and it implements an algorithm to identify datasets with a significant proportion of rounded coordinates. It is especially suited to large datasets. The reference for the methodology is Zizka et al. (2019) https://doi.org/10.1111/2041-210X.13152

  • A set of functions for detecting and correcting errors in the point data used in species distribution modelling. Includes functions for parsing coordinates in various formats and converting them into decimal degrees.

  • The package brings together several aspects of biodiversity data cleaning in one place. 'bdc' is organized in thematic modules related to different biodiversity dimensions, including: 1) Merge datasets: standardization and integration of different datasets; 2) Pre-filter: flagging and removal of invalid or non-interpretable information, followed by data amendments; 3) Taxonomy: cleaning, parsing, and harmonization of scientific names from several taxonomic groups against locally stored taxonomic databases, using exact and partial matching algorithms; 4) Space: flagging of erroneous, suspect, and low-precision geographic coordinates; and 5) Time: flagging and, whenever possible, correction of inconsistent collection dates. In addition, it contains features to visualize, document, and report data quality, which is essential for making data quality assessment transparent and reproducible. The reference for the methodology is Bruno et al. (2022) https://doi.org/10.1111/2041-210X.13868
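
Two of the coordinate checks mentioned in the entries above can be illustrated with a short sketch: converting degrees/minutes/seconds strings into decimal degrees, and flagging invalid, zero, and identical latitude/longitude values. The sketch is in Python rather than R, and it does not reproduce the code or API of any of the listed packages. The names `dms_to_decimal` and `flag_record`, the accepted input format, and the exact flag definitions are all assumptions made for illustration.

```python
import math
import re

# Hypothetical pattern: degrees, optional minutes and seconds, then a hemisphere letter.
_DMS = re.compile(
    r"""^\s*(\d+(?:\.\d+)?)°\s*(?:(\d+(?:\.\d+)?)'\s*)?(?:(\d+(?:\.\d+)?)"\s*)?([NSEWnsew])\s*$"""
)


def dms_to_decimal(text):
    """Convert a string such as 33°55'30"S to signed decimal degrees.

    Raises ValueError when the string does not match the assumed format.
    """
    match = _DMS.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees, minutes, seconds, hemisphere = match.groups()
    value = float(degrees) + float(minutes or 0) / 60 + float(seconds or 0) / 3600
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def flag_record(lat, lon):
    """Return simple quality flags for one latitude/longitude pair.

    Assumed definitions (simplified for this sketch):
      invalid: non-numeric, non-finite, or outside [-90, 90] / [-180, 180]
      zero:    both coordinates are exactly 0
      equal:   latitude and longitude are exactly identical
    """
    valid = (
        isinstance(lat, (int, float))
        and isinstance(lon, (int, float))
        and math.isfinite(lat)
        and math.isfinite(lon)
        and -90 <= lat <= 90
        and -180 <= lon <= 180
    )
    return {
        "invalid": not valid,
        "zero": valid and lat == 0 and lon == 0,
        "equal": valid and lat == lon,
    }


# Made-up example records: (identifier, latitude, longitude).
records = [
    ("a", dms_to_decimal("33°55'30\"S"), dms_to_decimal("18°25'26\"E")),
    ("b", 0.0, 0.0),
    ("c", 12.5, 12.5),
    ("d", 95.0, 10.0),
]

# Map each record identifier to the names of the flags it raised.
flagged = {
    name: [flag for flag, raised in flag_record(lat, lon).items() if raised]
    for name, lat, lon in records
}
```

A real cleaning workflow would typically add tolerance buffers around each test and treat flags as candidates for review rather than as automatic exclusions.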