bdc R package - Biodiversity Data Cleaning
The package brings together several aspects of biodiversity data cleaning in one place. 'bdc' is organized into thematic modules that address different dimensions of biodiversity data: 1) Merge datasets: standardization and integration of different datasets; 2) Pre-filter: flagging and removal of invalid or non-interpretable information, followed by data amendments; 3) Taxonomy: cleaning, parsing, and harmonization of scientific names from several taxonomic groups against locally stored taxonomic databases, using exact and partial matching algorithms; 4) Space: flagging of erroneous, suspect, and low-precision geographic coordinates; and 5) Time: flagging and, whenever possible, correction of inconsistent collection dates.
In addition, it contains features to visualize, document, and report data quality, which is essential for making data quality assessment transparent and reproducible. The reference for the methodology is Ribeiro et al. (2022), https://doi.org/10.1111/2041-210X.13868
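The flagging approach used by the Pre-filter module can be illustrated with a short R sketch. It assumes the 'bdc' package is installed from CRAN and uses a small invented table with Darwin Core style column names; the records and the object names `records` and `flagged` are hypothetical and serve only as an illustration, not as the workflow recommended by the package authors.

```r
# Minimal sketch, assuming the 'bdc' package is installed: install.packages("bdc")
library(bdc)

# Invented toy records (hypothetical data, for illustration only)
records <- data.frame(
  scientificName   = c("Panthera onca", NA, "Ocotea odorifera", "Puma concolor"),
  decimalLatitude  = c(-15.6, -43.3, 95.2, NA),
  decimalLongitude = c(-47.9, -39.6, -45.4, -50.1),
  stringsAsFactors = FALSE
)

# Pre-filter: flag records without a scientific name
flagged <- bdc_scientificName_empty(
  data = records,
  sci_names = "scientificName"
)

# Pre-filter: flag records with missing coordinates
flagged <- bdc_coordinates_empty(
  data = flagged,
  lat = "decimalLatitude",
  lon = "decimalLongitude"
)

# Pre-filter: flag coordinates outside the valid latitude/longitude range
flagged <- bdc_coordinates_outOfRange(
  data = flagged,
  lat = "decimalLatitude",
  lon = "decimalLongitude"
)

# Each test appends a logical column whose name starts with ".";
# FALSE marks a record that failed the corresponding test
flag_cols <- grep("^\\.", names(flagged), value = TRUE)
print(flagged[, flag_cols])
```

Records are flagged rather than deleted, so the decision about which tests justify removing a record stays with the user and can be documented.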
Identification
- Date (Publication): 2023-03-13
- Status: Completed
- Version: 1.1.4
- Keywords: data cleaning, data standardization, data integration, data quality, biodiversity data, scientific name harmonization, error detection, R
- Access constraints: License
- Use limitation: https://cran.r-project.org/web/licenses/GPL-3
- OnLine resource: CRAN link (WWW:LINK-1.0-http--link)
- OnLine resource: Development site (WWW:LINK-1.0-http--link)
- Service Category: data collection and preparation, data cleaning
- Service Language: eng
- Service TRL: TRL 9 – Actual system proven in operational environment
- Service Training: https://doi.org/10.1111/2041-210X.13868
- Service User Manual: https://cran.r-project.org/web/packages/bdc/bdc.pdf