<?xml version="1.0" encoding="UTF-8"?>
<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:srv="http://www.isotc211.org/2005/srv" xmlns:gmx="http://www.isotc211.org/2005/gmx" xmlns:gts="http://www.isotc211.org/2005/gts" xmlns:gsr="http://www.isotc211.org/2005/gsr" xmlns:gmi="http://www.isotc211.org/2005/gmi" xmlns:gml="http://www.opengis.net/gml/3.2" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
  <gmd:fileIdentifier>
    <gco:CharacterString>7b9f1a37-d6ec-4628-a0a5-994c2a865a3c</gco:CharacterString>
  </gmd:fileIdentifier>
  <gmd:language>
    <gmd:LanguageCode codeList="http://www.loc.gov/standards/iso639-2/" codeListValue="eng" />
  </gmd:language>
  <gmd:hierarchyLevel>
    <gmd:MD_ScopeCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#MD_ScopeCode" codeListValue="workflow" />
  </gmd:hierarchyLevel>
  <gmd:metadataStandardVersion>
    <gco:CharacterString>1.0</gco:CharacterString>
  </gmd:metadataStandardVersion>
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Phytosociological Inventory Import: ZIP with XMLs to Database</gco:CharacterString>
          </gmd:title>
          <gmd:date>
            <gmd:CI_Date>
              <gmd:date>
                <gco:DateTime>2023-12-31T00:00:00</gco:DateTime>
              </gmd:date>
              <gmd:dateType>
                <gmd:CI_DateTypeCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_DateTypeCode" codeListValue="publication" />
              </gmd:dateType>
            </gmd:CI_Date>
          </gmd:date>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract>
        <gco:CharacterString>This workflow imports phytosociological inventory data, stored as multiple XML files within a ZIP archive, into a MongoDB database. The import supports data management within the project's Virtual Research Environment (VRE). The workflow extracts the XML files from the ZIP archive, converts them to JSON for MongoDB compatibility, checks for duplicates to protect data integrity, and uploads the data, incrementing the inventory count. The result is a more reliable inventory dataset for analysis.

Background
Phytosociological inventories depend on efficient data management, which requires inventory data to be integrated consistently. This workflow addresses that need by importing phytosociological inventories, stored in multiple XML files within a ZIP archive, into the MongoDB database. The integration underpins data analysis in the project's VRE. The workflow comprises two essential components: converting XML to JSON and checking for inventory duplicates, which together improve the integrity and coverage of the inventory database.

Introduction
This workflow focuses on importing phytosociological inventories, stored in multiple XML files within a ZIP archive, into the MongoDB database. The import is integral to the project's VRE and lays the groundwork for data analysis. Its primary goal is a smooth, duplicate-free integration that yields a reliable dataset for further exploration and use within the VRE.

Aims
The primary aim of this workflow is to streamline the integration of phytosociological inventory data, stored in multiple XML files within a ZIP archive, into the MongoDB database, producing a duplicate-free dataset for further analysis within the project's VRE. To achieve this, the workflow includes the following key components:
- ZIP Extraction and XML to JSON Conversion: Extracts the XML files from the ZIP archive and converts each phytosociological inventory from XML to JSON, preparing the data for MongoDB.
- Duplicate Check and Database Upload: Checks for duplicate inventories in the MongoDB database and uploads the JSON files, incrementing the inventory count in the database.

Scientific Questions
- ZIP Archive Handling: How effectively does the workflow handle ZIP archives containing multiple XML files with distinct phytosociological inventories?
- Data Format Compatibility: How successful is the conversion of XML-based phytosociological inventories to JSON for MongoDB integration?
- Database Integrity Check: How effective is the duplicate check in ensuring data integrity by identifying and handling duplicate inventories?
- Inventory Count Increment: How does the workflow contribute to incrementing the inventory count in the MongoDB database, and how is this reflected in the overall project dataset?</gco:CharacterString>
      </gmd:abstract>
      <gmd:status>
        <gmd:MD_ProgressCode codeListValue="onGoing" codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#MD_ProgressCode" />
      </gmd:status>
      <gmd:pointOfContact>
        <gmd:CI_ResponsibleParty>
          <gmd:individualName>
            <gco:CharacterString>José Francisco Aldana Montes</gco:CharacterString>
          </gmd:individualName>
          <gmd:organisationName>
            <gco:CharacterString>University of Malaga</gco:CharacterString>
          </gmd:organisationName>
          <gmd:contactInfo>
            <gmd:CI_Contact>
              <gmd:address>
                <gmd:CI_Address>
                  <gmd:electronicMailAddress>
                    <gco:CharacterString>jfaldana@uma.es</gco:CharacterString>
                  </gmd:electronicMailAddress>
                </gmd:CI_Address>
              </gmd:address>
            </gmd:CI_Contact>
          </gmd:contactInfo>
          <gmd:role>
            <gmd:CI_RoleCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_RoleCode" codeListValue="principalInvestigator" />
            <!-- Lifewatch - removed author value-->
          </gmd:role>
        </gmd:CI_ResponsibleParty>
      </gmd:pointOfContact>
      <gmd:pointOfContact>
        <gmd:CI_ResponsibleParty>
          <gmd:individualName>
            <gco:CharacterString>Francisco Manuel SÁNCHEZ-CANO</gco:CharacterString>
          </gmd:individualName>
          <gmd:organisationName>
            <gco:CharacterString>LifeWatch ERIC ICT Core</gco:CharacterString>
          </gmd:organisationName>
          <gmd:contactInfo>
            <gmd:CI_Contact>
              <gmd:address>
                <gmd:CI_Address>
                  <gmd:electronicMailAddress>
                    <gco:CharacterString>franciscom.sanchez@lifewatch.eu</gco:CharacterString>
                  </gmd:electronicMailAddress>
                </gmd:CI_Address>
              </gmd:address>
            </gmd:CI_Contact>
          </gmd:contactInfo>
          <gmd:role>
            <gmd:CI_RoleCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_RoleCode" codeListValue="publisher" />
            <!-- Lifewatch - removed author-->
          </gmd:role>
        </gmd:CI_ResponsibleParty>
      </gmd:pointOfContact>
      <gmd:pointOfContact>
        <gmd:CI_ResponsibleParty>
          <gmd:individualName>
            <gco:CharacterString>Antonio José SÁENZ-ALBANÉS</gco:CharacterString>
          </gmd:individualName>
          <gmd:organisationName>
            <gco:CharacterString>LifeWatch ERIC ICT Core</gco:CharacterString>
          </gmd:organisationName>
          <gmd:contactInfo>
            <gmd:CI_Contact>
              <gmd:address>
                <gmd:CI_Address>
                  <gmd:electronicMailAddress>
                    <gco:CharacterString>aj.saenz@lifewatch.eu</gco:CharacterString>
                  </gmd:electronicMailAddress>
                </gmd:CI_Address>
              </gmd:address>
            </gmd:CI_Contact>
          </gmd:contactInfo>
          <gmd:role>
            <gmd:CI_RoleCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_RoleCode" codeListValue="custodian" />
            <!-- Lifewatch - removed author-->
          </gmd:role>
        </gmd:CI_ResponsibleParty>
      </gmd:pointOfContact>
      <gmd:pointOfContact>
        <gmd:CI_ResponsibleParty>
          <gmd:individualName>
            <gco:CharacterString>ICT Core Group</gco:CharacterString>
          </gmd:individualName>
          <gmd:organisationName>
            <gco:CharacterString>LifeWatch ERIC ICT Core</gco:CharacterString>
          </gmd:organisationName>
          <gmd:contactInfo>
            <gmd:CI_Contact>
              <gmd:address>
                <gmd:CI_Address>
                  <gmd:electronicMailAddress>
                    <gco:CharacterString>ict.coordination@lifewatch.eu</gco:CharacterString>
                  </gmd:electronicMailAddress>
                </gmd:CI_Address>
              </gmd:address>
            </gmd:CI_Contact>
          </gmd:contactInfo>
          <gmd:role>
            <gmd:CI_RoleCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_RoleCode" codeListValue="principalInvestigator" />
            <!-- Lifewatch - removed author-->
          </gmd:role>
        </gmd:CI_ResponsibleParty>
      </gmd:pointOfContact>
      <gmd:descriptiveKeywords>
        <gmd:MD_Keywords>
          <gmd:keyword>
            <gco:CharacterString>Phytosociological inventory</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>ZIP archive</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>XML files</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>JSON conversion</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>MongoDB database</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>Data integration</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>Duplicate check</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>Data integrity</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>Virtual Research Environment (VRE)</gco:CharacterString>
          </gmd:keyword>
          <gmd:keyword>
            <gco:CharacterString>Inventory count increment</gco:CharacterString>
          </gmd:keyword>
        </gmd:MD_Keywords>
      </gmd:descriptiveKeywords>
      <gmd:resourceConstraints>
        <gmd:MD_LegalConstraints>
          <gmd:accessConstraints>
            <gmd:MD_RestrictionCode codeListValue="copyright" codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#MD_RestrictionCode" />
          </gmd:accessConstraints>
          <gmd:useLimitation gco:nilReason="missing">
            <gco:CharacterString />
          </gmd:useLimitation>
          <gmd:otherConstraints>
            <gco:CharacterString>Copyright 2023 Khaos Research Group</gco:CharacterString>
          </gmd:otherConstraints>
        </gmd:MD_LegalConstraints>
      </gmd:resourceConstraints>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
  <gmd:workflow>
    <gmd:LW_Workflow>
      <gmd:containServices_workflow>
        <gmd:LW_WorkflowContainServices>
          <gmd:serviceName_workflow>
            <gco:CharacterString>Import file ZIP</gco:CharacterString>
          </gmd:serviceName_workflow>
          <gmd:serviceDescription_workflow>
            <gco:CharacterString>Extract XML files from the provided ZIP archive</gco:CharacterString>
          </gmd:serviceDescription_workflow>
          <gmd:serviceReference_workflow>
            <gco:CharacterString>https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/core/ImportFile/0.0.5</gco:CharacterString>
          </gmd:serviceReference_workflow>
        </gmd:LW_WorkflowContainServices>
      </gmd:containServices_workflow>
      <gmd:containServices_workflow>
        <gmd:LW_WorkflowContainServices>
          <gmd:serviceName_workflow>
            <gco:CharacterString>ZIP XML to JSON</gco:CharacterString>
          </gmd:serviceName_workflow>
          <gmd:serviceDescription_workflow>
            <gco:CharacterString>Convert each extracted XML file, which contains phytosociological inventory data, into JSON format for compatibility with MongoDB. Check the JSON data against existing entries in the MongoDB database to identify and handle any duplicates, ensuring data integrity.</gco:CharacterString>
          </gmd:serviceDescription_workflow>
          <gmd:serviceReference_workflow>
            <gco:CharacterString>https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/data-processing/Zipxml2json/1.0.0</gco:CharacterString>
          </gmd:serviceReference_workflow>
        </gmd:LW_WorkflowContainServices>
      </gmd:containServices_workflow>
      <gmd:containServices_workflow>
        <gmd:LW_WorkflowContainServices>
          <gmd:serviceName_workflow>
            <gco:CharacterString>Import to DB</gco:CharacterString>
          </gmd:serviceName_workflow>
          <gmd:serviceDescription_workflow>
            <gco:CharacterString>Upload the verified, duplicate-free JSON data into the MongoDB database, updating the inventory count.</gco:CharacterString>
          </gmd:serviceDescription_workflow>
          <gmd:serviceReference_workflow>
            <gco:CharacterString>https://gitlab.lifewatch.dev/lfw002-khaos/wrapper-library/-/tree/develop/data-sink/Json2db/1.0.0</gco:CharacterString>
          </gmd:serviceReference_workflow>
        </gmd:LW_WorkflowContainServices>
      </gmd:containServices_workflow>
      <gmd:workflowOtherInformation_workflow>
        <gmd:LW_workflowOtherInformation>
          <gmd:workflowHelpdesk_workflow>
            <gco:CharacterString>https://helpdesk.lifewatch.eu</gco:CharacterString>
          </gmd:workflowHelpdesk_workflow>
        </gmd:LW_workflowOtherInformation>
      </gmd:workflowOtherInformation_workflow>
    </gmd:LW_Workflow>
  </gmd:workflow>
  <gmd:distributionInfo>
    <gmd:MD_Distribution>
      <gmd:transferOptions>
        <gmd:MD_DigitalTransferOptions>
          <gmd:onLine>
            <gmd:CI_OnlineResource>
              <gmd:linkage>
                <gmd:URL />
              </gmd:linkage>
              <gmd:protocol>
                <gco:CharacterString>DOI</gco:CharacterString>
              </gmd:protocol>
              <gmd:applicationProfile gco:nilReason="missing">
                <gco:CharacterString />
              </gmd:applicationProfile>
              <gmd:name gco:nilReason="missing">
                <gco:CharacterString />
              </gmd:name>
              <gmd:description gco:nilReason="missing">
                <gco:CharacterString />
              </gmd:description>
              <gmd:function>
                <gmd:CI_OnLineFunctionCode codeList="http://standards.iso.org/iso/19139/resources/gmxCodelists.xml#CI_OnLineFunctionCode" codeListValue="" />
              </gmd:function>
            </gmd:CI_OnlineResource>
          </gmd:onLine>
        </gmd:MD_DigitalTransferOptions>
      </gmd:transferOptions>
    </gmd:MD_Distribution>
  </gmd:distributionInfo>
</gmd:MD_Metadata>

