
Phytosociological Inventory Import: Word to Database

This workflow imports phytosociological inventory data from Word documents (.docx) into a MongoDB database that serves the project's Virtual Research Environment (VRE). It converts the Word documents to JSON, checks for duplicate inventories to protect data integrity, and uploads the JSON files to the database. The result is a reliable, duplicate-free inventory dataset available for analysis within the VRE.

Background

Efficient data management is crucial in phytosociological inventories, and it depends on inventory data being integrated reliably. This workflow addresses one part of that task: importing phytosociological inventories stored in Word documents (.docx) into the MongoDB database. The integration supports the project's VRE by providing a foundation for data analysis. The workflow comprises two essential components, converting Word to JSON and checking for inventory duplicates, which together improve the integrity and coverage of the inventory database.

Introduction

The workflow focuses on importing phytosociological inventories, stored in Word documents (.docx), into the MongoDB database. This process is integral to the project's VRE because it lays the groundwork for data analysis. Its primary goal is a smooth, duplicate-free integration that yields a reliable dataset for further exploration and use within the VRE.

Aims

The primary aim of this workflow is to streamline the integration of phytosociological inventory data, stored in Word documents (.docx), into the MongoDB database, so that a duplicate-free dataset is available for further analysis within the project's VRE. To achieve this, the workflow includes the following key components:

- Word to JSON Conversion: Converts phytosociological inventories stored in Word documents (.docx) to JSON, preparing the data for MongoDB compatibility.

- Duplicate Check and Database Upload: Checks for duplicate inventories in the MongoDB database and uploads the JSON files, incrementing the inventory count in the database.

Scientific Questions

- Word Document Parsing: How effectively does the workflow parse and extract phytosociological inventories from Word documents (.docx)?

- Data Format Compatibility: How successful is the conversion of Word-based phytosociological inventories to the JSON format for MongoDB integration?

- Database Integrity Check: How effective is the duplicate check component in ensuring data integrity by identifying and handling duplicate inventories?

- Inventory Count Increment: How does the workflow contribute to the increment of the inventory count in the MongoDB database, and how is this reflected in the overall project dataset?


Date (Publication)
2023-12-31T00:00:00
Status
Ongoing / operational
Principal investigator
  University of Malaga - José Francisco Aldana Montes

Publisher
  LifeWatch ERIC ICT Core - Francisco Manuel SÁNCHEZ-CANO

Custodian
  LifeWatch ERIC ICT Core - Antonio José SÁENZ-ALBANÉS

Principal investigator
  LifeWatch ERIC ICT Core - ICT Core Group

Keywords

Phytosociological inventory
Word documents (.docx)
JSON conversion
MongoDB database
Data integration
Duplicate check
Data integrity
Virtual Research Environment (VRE)
Inventory count increment
Data parsing

Access constraints
Copyright
Other constraints

Copyright 2023 Khaos Research Group

Protocol

DOI

Service Name

Import file Word

Service Description

Extract phytosociological inventory data from Word documents (.docx).

Service Reference (id)

https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/core/ImportFile/0.0.5
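The record does not describe how the Import file Word service reads a document. As a minimal sketch only, and assuming the inventories are laid out as Word tables (an assumption, not something stated here), the cell text of a .docx file can be read with the Python standard library, because a .docx file is a ZIP archive whose main content is in word/document.xml. The function name read_docx_tables is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def _text(element):
    """Concatenate the text of all <w:t> runs below an element."""
    return "".join(t.text or "" for t in element.iter(W_NS + "t"))


def read_docx_tables(path_or_file):
    """Return every table in the document as a list of rows of cell strings.

    Hypothetical helper: assumes the inventory is stored as Word tables.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    tables = []
    for tbl in root.iter(W_NS + "tbl"):
        rows = [
            [_text(cell).strip() for cell in row.findall(W_NS + "tc")]
            for row in tbl.findall(W_NS + "tr")
        ]
        tables.append(rows)
    return tables
```

Calling read_docx_tables("inventory.docx") on a document with one table returns a list with one entry, itself a list of rows. Merged cells and nested tables would need extra handling that this sketch does not attempt.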

Service Name

Word to JSON

Service Description

Convert the extracted data from Word format to JSON format to ensure compatibility with MongoDB. Check the JSON data for duplicates against existing entries in the MongoDB database to maintain data integrity.

Service Reference (id)

https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/data-processing/Word2json/1.0.0
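To illustrate the kind of conversion this service performs, the sketch below turns a table (a header row followed by data rows) into a JSON-ready dictionary and attaches a content fingerprint that a later duplicate check could compare. The JSON layout, the field names source, records and fingerprint, the use of SHA-256, and the sample rows are all assumptions made for illustration; the actual schema used by Word2json is not given in this record.

```python
import hashlib
import json


def table_to_inventory(rows, source_name):
    """Turn a header row plus data rows into a JSON-ready inventory dict.

    Hypothetical layout: one record per data row, keyed by the header cells.
    """
    header, *body = rows
    records = [dict(zip(header, row)) for row in body]
    inventory = {"source": source_name, "records": records}
    # Fingerprint the records only, so the same inventory found in a
    # differently named file still produces the same value.
    canonical = json.dumps(records, sort_keys=True, ensure_ascii=False)
    inventory["fingerprint"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return inventory


# Made-up sample rows, for illustration only.
rows = [["Species", "Cover"], ["Quercus ilex", "3"], ["Pistacia lentiscus", "+"]]
doc = table_to_inventory(rows, "example.docx")
print(json.dumps(doc, indent=2, ensure_ascii=False))
```

Hashing a canonical serialisation (sorted keys, fixed encoding) is one simple way to detect exact duplicates. It would not catch inventories that differ only in spelling or row order, which a real duplicate check may need to consider.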

Service Name

Import to DB

Service Description

Upload the validated, duplicate-free JSON data into the MongoDB database, updating the inventory count.

Service Reference (id)

https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/data-sink/Json2db/1.0.0
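A duplicate-aware upload of this kind could look roughly like the sketch below. It assumes each inventory document carries a fingerprint field identifying its content (a hypothetical field, not confirmed by this record) and that the collection object behaves like a pymongo Collection, using only its find_one, insert_one and count_documents methods. The function name and the database and collection names in the comments are also hypothetical.

```python
def upload_inventory(collection, inventory):
    """Insert the inventory unless one with the same fingerprint is stored.

    `collection` is expected to behave like a pymongo Collection.
    Returns (inserted, total), where total is the inventory count afterwards.
    """
    duplicate = collection.find_one({"fingerprint": inventory["fingerprint"]})
    if duplicate is None:
        collection.insert_one(inventory)
    return duplicate is None, collection.count_documents({})


# Usage against a real server (requires the third-party pymongo package;
# the connection string and the names "vre" and "inventories" are placeholders):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://localhost:27017")["vre"]["inventories"]
#   inserted, total = upload_inventory(collection, inventory)
```

A check followed by an insert is not atomic, so two concurrent uploads of the same inventory could both succeed. If that matters, a unique index on the fingerprint field, created with collection.create_index("fingerprint", unique=True), lets the database reject the second insert.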

Workflow Helpdesk

https://helpdesk.lifewatch.eu

Metadata

File identifier
c4a313ef-c5a6-4b54-b54a-7ff25c8d8318
Metadata language
en
Hierarchy level
Workflow
Metadata Schema Version

1.0

 
 
